Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



725-726 738 
838 844 857- 
928 963 976 
1050 1116- 
1226 1243 
1339 1353 
1447 1504 
1631 1641 
1691-1692 



621 626 649 679 719 
793 803 831 834-836 
858 866 879 905 913 
1005-1006 1012 1038 
1117 1151 1199 1204 
1265 1274 1324-1325 
1374 1377 1440-1441 
1549 1600 1618-1619 
1644 1653 1687-1688 
1741 1771 



ALV001 



young liver 



GIBCO 



5-8 11 20-21 46 
75 79 82 93 97 
116 139 143-144 
174 187-189 194 
215 230 250 258 
306 309 342 351 
374 392 394 398 
414 431 444 455 
493 510-512 516 
549 571 574-577 
607 621-624 628 
648 660 666-667 
717 719 728 730 
766 770 773 779 
814 841 849-851 
893 898-900 902 
919 922 924 934 
970 984 986 997 
1012 1029-1030 
1061 1066 1070 
1093 1099-1102 
1117 1119 1121 
1144-1145 1156 
1199-1200 1209 
1241 1244 1262 
1283 1295 1317 
1344 1359 1362- 
1384 1403 1415 
1450 1467 1475- 
1494-1495 1498 
1518-1519 1526 
1552 1557-1559 
1597 1609 1614 
1641 1644 1654- 
1669 1684 1691- 
1725 1738 1741 
1760-1761 1763- 



50-51 58 65-66 
102-103 108 110 
148-149 171-172 
195 198 209 214- 
267-269 280-281 
356 359 362 372 
401 407-408 410 
459 476 478 483 
520 522 526 536 
585 592 601-602 
-630 632-633 637 
678 697-698 700 
734 738 744-745 
788 800 808 812 
871 874 879 887 
904 906-907 911 
953 957 963 965 
1001 1004 1007 
1033-1034 1052 
1076 1086 1089 
1110-1112 1116- 
1125 1136-1137 
1157 1159 1196 
1211 1219-1220 
1270 1275 1279 
1320 1332 1339 
1363 1379 1383- 
1430-1431 1437 
1476 1483-1484 
1505 1512 1516 
1529 1547 1550- 
1565 1583 1587 
1620 1631 1637 
1655 1662 1667 
1692 1702 1711 
1743-1744 1758 
1765 1769 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



adult liver 



Invitrogen 



ALV002



1550 1567 1578 
1597 1601-1602 
1618-1619 1621 
1647 1652 1654- 
1669-1671 1684 
1738 1742-1744 
1765 1772 1774 



1581 1583 1594 
1611-1612 1615 
1625 1637 1645 
1655 1660 1666
1706 1722 1737- 
1760-1761 1763- 



adult liver 



Clontech 



ALV003



29 676 997 1063 1119 1536 1766 



adult ovary 



Invitrogen 



AOV001



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585- 
588 590-591 593 595 597 599 601-
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108 
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195 
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283 
1286-1289 1291 1293-1294 1298- 



1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338-1339 1341 1343-1351 1356
1359 1361 1365-1366 1371-1375
1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-1427
1429-1431 1435-1436 1439-1443
1445-1450 1453-1454 1459 1463-1464
1466 1468 1470 1474-1481 1484-1485
1488 1491 1493-1494 1496-1498
1501-1504 1506-1507 1511-1517
1519 1521-1524 1526-1527 1530-1531
1534-1536 1538-1539 1541 1546
1548-1550 1553 1555-1559


1561-1563


1566- 


130 1 


1569-1570 


1 D / Z 


1574- 


-1575 


lota 


1580-1581 1587-1588 1590-1591 1595
1597-1598 1600-1606 1609 1611-1621
1623-1630 1634 1636 1638 1641 1643
1645 1647-1657 1659-1662 1664 1667
1669-1671 1673-1674 1676-1681
1683-1690 1699 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-1738
1740-1741 1743-1744 1748-1751 1753
1755-1756 1760-1762 1765 1767-1768
1770-1771 1776 1778-1779 1783-1784
1786





adult placenta 



Clontech 



APL001



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



placenta 



Invitrogen 



APL002 



adult spleen 



14-16 26 29 43 60-6 
106 116 135 171 177 
198 210 216 235-236 
309 329 334 339 359 
423 430 434-435 448 
491 517 522 631 723 
738 746 769 818 843 
858 916 948 953-954 
1005-1006 1013 1033 
1068 1070 1086 1139 
1160 1277 1285 1317
1345 1429 1435 1438 
1486 1490 1512 1519 
1592-1593 1602 1626 
1664 1673 1675 1722 
1746 1776 



1 79-80 103 
180 194 196 
272 290 299 
379-380 417 
454 483 490- 
725-726 728 
854-855 857- 
976 988-989 
1036 1064 
1144-1145 
1320 1343 
1454 1482 
1532 1549 
1647 1649 
1727 1730 



GIBCO 



ASP001



3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-387
394 406 414 431 434-436 446 448
451 473 481 490-493 500 503 505
517 519 530 534 536-540 547 554
557 574-576 582 592 595 604 611-612
620-621 623 631-632 642 652 659
661 667 671 673-675 684 700 721
728 730 732 738 742-744 746 762
765 774 780 788-789 794 810-811
817 822 830 832 845 848 852-853
858 862 866 874 879 882



884 906-908 912 919 
927 934 942 949 957 
978 983 990 992-994 
1005-1007 1010 1012 
1042-1044 1046 1049 
1070 1076 1089-1090 
1109 1113 1115 1124 
1170 1174 1177 1190 
1220 1226-1227 1229 
1246 1258 1269 1271 
1301 1320 1322 1330 
1339 1349 1351 1353 
1364 1369 1374 1386 
1417 1434 1436-1437 
1474 1477 1480 1485 
1512 1522 1525 1544 
1560 1567 1591 1600 
1651 1654-1655 1658 
1674 1678-1679 1684 
1727 1733 1738 1740 
1761 1774 1779 1781 



921-923 926- 
958 963 977- 
996-997 999
1031 1036 
1059 1068 
1094 1103 
1140 1163 
1196 1219- 
1236 1241 
1274 1295 
1334-1335 
1359-1360 
1397 1413 
1439 1468 
1487 1498 

-1549 1553 
1631 1636 
1662 1670 
1686 1700 

-1741 1760- 

-1782 



testis 



GIBCO 



ATS001 



5-8 10 26 30-31 47 
69 82 84-85 97 102 
139 150 152 154 156
176-177 192 194 196 
227-228 247 255 258 
288-289 301 307 311 
349 370-372 392 398 
427 430-431 433 437 
469 473 477 481-482 
503 513 522 526 547 
564 572-573 575-576 
599-602 605 612 615 
637 647 649-650 656 
674-675 712 719-721 
738 744 746 773 780
802 804 809 811 814 
843 845 848 859 866 
913 916 919 921 926 
960 963 971 975 977 
993 1007 1016 1029 
1035 1038-1039 1045 
1064 1070 1072-1073 
1097 1099-1102 1104 
1141 1149 1161-1162 
1209 1222 1227 1229 
1238-1239 1243 1253 
1289 1291-1293 1307 
1320 1330 1332 1338 
1373-1374 1379 1389 
1409 1423-1424 1430 
1443 1459 1484 1486 
1496-1497 1501 1505 
1527 1530-1531 1533 
1549 1563 1565 1567 
1577 1586 1591 1599 
1628 1630-1632 1636 
1649 1661-1662 1666 
1675 1684 1690 1699 
1717 1724 1730 1737 
1767 1779 



50-51 57 68- 

113 119 137 
163 169 174 

-197 212-215 
261 282 285 
316 330 334 
410 415 426- 
446 454 461 
493 499 502- 
552-553 563- 
581-582 585 

-617 620 631 
660 665 670 
723 728 731 
784 788-789
826 831 837 
869 877 905 
929 937 950 
981 990 992- 

1030 1034- 
1059-1060 
1087 1089 
1108 1113 
1175 1208- 
1231 1235 
1285 1287- 
1311 1317- 
1345 1369 
1399-1400 
1435-1437 
1490 1493 
1509-1513 
1537 1546 
1569 1571 
1602 1625 
1639 1642 
1667 1670 
1705 1712 
1738 1752 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



686 1352 1412 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 
Genomic DNA
from BAC 39316



Research
Genetics
(CITB BAC
Library)



BAC003 



1352 



adult bladder 



Invitrogen 



BLD001



5-8 17-18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 788 837 840 866 893 898 
909 918 929 966 977 963 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



bone 



Clontech 



BMD001 



3-8 11 13 13 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 



1506 1509 1513 1521-1522 1524 1526
1528 1531 1536-1537 1543 1546
1548-1549 1552 1554-1555 1557-1559
1571-1572 1581 1589-1592 1597-1600
1609 1614 1621 1626-1628 1630-1632
1634 1636 1638-1639 1641 1646-1647
1651 1653-1655 1661-1662 167S-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723 1727
1737-1738 1740 1758 1767 1772
1781-1782 1785-1786





bone marrow 



Clontech 



BMD002 



bone marrow 



bone marrow 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1080-1089 1106 1112-1114 1155
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367 
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



Clontech 



Clontech 



BMD004



73-74 503 922 1036 1711 



BMD007 



95-96 866 1320 1475 



adult colon 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 386 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
102S 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 



1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677 1687-
1688 1701 1713-1714 1724 1740
1765



Mixture of 16 
tissues - 
mRNAs 



Various 
Vendors 



CTL016 



401 1490 1686 



Mixture of 16 
tissues - 
mRNAs*



adult cervix



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715 



BioChain 



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229- 
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581- 
582 585-586 588-589 593-594 600 
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 696 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-933 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



CVX001 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).



WO 01/53312 



PCT/US00/34263 



1406 1416 1425-1427 1431 1436-
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674-1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731-1732 1735-1739
1741 1743-1744 1748-1749 1755
1760-1762 1767 1773 1778 1785-
1786














diaphragm 


BIOL, nam 




137 282 289 730 780 986 1409
1478 1599 1614










endothelial cells


Stratagene


EDT001


3 5-10 13 15-21 24-26 29 34 37-
39 42 44-45 50-51 53-55 57-58
60-61 65-66 68-69 73-74 77-78 80
82-83 85 87 89 93-96 101-105 108
110 112-114 116 118-122 124 128
133-134 137-142 147-150 152-153
161-163 166-172 176-179 187 190
192 194 196-201 204-207 210 212-
214 220 224 229-230 233 235-236
240-241 251-252 258 261-262 265
267-269 272 276-277 279-281 284-
285 288 290 295-296 301-302 310-
311 313 316 321 325 329 331-333
33S 340 342 351-355 360 371 375
380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427
431 434-436 439 444-445 449 454
463-464 472-475 477-479 486 488-
490 497-498 500-504 510-513 516-
519 522 524 526-528 532-534 536-
540 542-546 548 561-563 566-567
572-576 579 581 585-586 589 593
595 597 599 603 607-612 615-617
620 622 626 630 632-634 638-641
644 647 656-660 662-664 670 673
678 680-682 692-697 707 709-710
712-713 719 730 732 734 736 738
743-746 751 759 768 771 773 775-
778 783 786-789 793 800 803 805-
807 810-811 814 816-813 821-822
824 826 828-829 832 834-838 842-
845 848-850 854-860 862 864 869
871 874 876-879 883 885 887 890-
891 894-895 898-900 903 908 910-
913 916 919-922 924 926-928 930-
935 939 943 948-949 951-954 957
959-961 964 969-970 973 975-978
983-984 988-990 992-993 996-997
1000 1002 1004-1013 1016-1020
1022-1025 1028 1031 1033-1034
1038-1046 1050 1055-1056 1059-
1060 1062-1064 1067-1070 1072-
1074 1076 1078 1082 1086-1087
1089-1090 1093-1097 1099-1103
1107 1109-1113 1116-1117 1124-
1126 1128-1131 1134-1135 1138
1140 1144-1145 1148-1149 1153
1157 1160 1163 1171 1183-1184
1198-1199 1202 1205-1207 1211
1216-1217 1219 1221 122S 1229
1232-1235 1238-1241 1243-1244
1246 1250 1253 1257-1258 1261



1277 
1290 
1317- 
1330 
1345- 
1367 
1400 
1424- 
1440- 
1468 
1491 
1511 
1531 
1547 
1561 
1579 
1592 
1615 
1631 
1650 
1669 
1696 
1719 
1736 
1755 
1771 



i2?r 

1280- 

1293 

1320 

1334- 

1347 

1369 

1406 

1426 

1442 

1472 

1493 

1516 

1536 

1549 

-1565
1581 
1597 
1618 
1634 
1652 
1671 

-1698 
1722 
1739 
1760 

-1773 



1268
1283 
1295 
1324- 
1335 
1350 
1374 
1408 
1428- 
1448 
1474 
1501- 
1520- 
1537 
1552 
1568 
-1583
1605- 
-1621
1636 
-1659 
1675- 
1703 
-1723 
-1741 
1761 
1776 



1270- 
1285- 
1298 
1325 
1338 
1355- 
1376 
1414 
1431 
1450 
1478 
1504 
1521 
1539- 
1555 
1571 
1587- 
1606 
1624 
1638 
1664 
-1681 
1711 
1726 
1743 
1765 
1779 



1271 
1286 
1308 
1327 
1342- 
1356 
1379 
1417 
1434- 
1462 
1487 
1506 
1526 
1540 
1557 
1575 
1588 
1611 
1628 
1641 
1666 
1683 
1715 
1731 
-1744 
1767 
1783 



1274- 
1288- 
1312 
1329- 
1343 
1359 
1398 
1419 
1438 
1466 
1488 
1509 
1529 
1546- 
-1559
1578- 
1590 
1613 
1630- 
1643- 
-1667 
-1688 
-1716 
-1733 
1749 
1768 
1786 



286 686 1297 1303-1304 1352 
1411-1412 1754 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



131-132 261 289 380 503 860 892 
1000 1007 1397 



esophagus 



BioChain 



ESO002 



fetal brain 



fetal brain 



Clontech 



FBR001 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



Clontech 



FBR004 



66-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR006 



5-9 25 43 60 
80 87 92 101 
149 152-153 
207-208 210 
238 251-253 
301-302 307 
330 333-334 
357 370 373 
391-392 397 
411 417 421 
437 440-443 
476 483 488- 
513 516 519- 
544 547 550 
590-591 595 
623 628-629 
657-658 660 
689 691-694 
710 716 720 
744 757-760 
806-807 810
858 861 864 
894-895 898 
936 938 945 
959 961 963 



62-63 65-66 
103 108 114 
157 168 171- 
212-213 221 
266 272 279- 
310 317-318 
336-338 346- 
377 379-380 
399 402 406- 
424 426-427 
454 460 464 
489 495 497 
520 524 530 
561 567 572 
597 604 607 
631 634 638 
665 669 674 
696-697 699 
728 732 734 
763 775-778 
817-818 826 
871-872 884 
904 915 921- 
950 952 955- 
967 969-971 



72 



70 

136 139 
172 175 
226 237- 
281 295 
321-324 
347 352 
382 384 
408 410- 
430 436- 
467 473 
508 510- 
537-540 
574 582 
609 615 
640 655 
675 679 
701 706 
736 742- 
780 799 
839 843 
890-891 
923 935- 
956 958- 
990 992 



999 1001 
1016 1022 
1035 1042 
1065 1067 
1114-1115 
1151 1153 
1172-1173 
1190-1200 
1226-1227 
1253-1255 
1270-1273 
1314 1317 
1339 1341 
1371 1373 
1386 1392 
1425-1426 
1440-1441 
1502-1503 
1519 1536 
1559 1573 
1611-1614 
1640 1651 
1693 1696 
1718 1720 
1730-1733 
1742 1745 
1767 1771- 
1786 



1005 
1024 
1047 
1070 
1119 
1156 
1178 
1211 
1229 
1258 
1281 
-1320 
1344 
1376 
1396 
1428 
1448 
1507 
1544 
1589 
1619 
1657 
1703 
1722 
1735 
1755 
1772 



1006 
1029 
1048 
1082 
1131 
1160 
1184 
1216 
1231 
1260 
1287 
1326 
1350 
1379 
1398 
-1429 
1466 
1511 
1549 
1590 
1621 
1658 
-1704 
1724 
1736 
1759 
1777 



1008 
-1030 
1052 
1089 
1143 
1163 
1186 
1222 
1236 
1262 
1308 
1334 
1356 
1381 
1419 
1432 
1470 
1513 
-1550 
1598 
1625 
1676 
1713 
1726 
1738 
-1761 
1779 



1013 
1032 
1056 
1109 
-1149 
1167 
1188 
1223 
1245 
1266 
-1309 
-1335 

1369- 
-1382 
1423 
1437 
1482 
1516 
1557- 
1608 
1626 
-1679 
1714 
1728 
1739 
1765 
1780 



fetal brain 
fetal brain 



Clontech 



FBRS03 



235-236 520 864 1068 1188 1587 



Invitrogen 



FBT002 



fetal heart 
fetal kidney 



Invitrogen 



Clontech 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120*1128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



FHR001 



105 124 180 299 864 1036 1148
1229 1614 1616 1762 1785 



FKD001 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



258 277 280-281 307 
371 387 392 395 403 
436 443 455 463 500 
563 572-573 585 600 
654 657-658 660 679 
798 821 833 844 854 
868 878 911 929 958 
992 1007 1046 1087 
1139 1285 1312 1331 
1371 1376 1391 1422 
1440-1441 1470 1543 
1618 1631 1651 1654 
1678-1679 1691-1692 



310 314 330 
422-423 431 
519 522 542 
619 623 650 
719 731 780 
855 857 864 
960 969 990 
1103 1129 
1355 1369 
1425-1426 
1598 1601 
1655 1669 
1733 1785 



fetal kidney 



Clontech 



FKD002 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



fetal kidney 



Invitrogen 
Clontech 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



fetal lung 



FLG001



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



fetal lung 



Invitrogen 



FLG003 



9 15-16 29 41 47 68 
102 124 137 152-153 
229 231 249 254 256 
300 325 333 344-345 
379 384 408 426-427 
468 475 483 488 493 
545 547 549 564 582 
660 662-664 670 673
761 766-767 774 805 
864 875 921 932 937 
988-989 1014 1016-1 
1090 1097 1170 1185 
1216 1224 1258 1290 
1342 1347 1355 1369 
1414 1431 1438 1449 
1536 1547 1557-1560 
1601 1636 1644 1653 
1667 1671 1675 1680 
1739 1760-1761 1769 



69 83 88-89 
165 196 224 
267 291-292 
352 373 376 
430 432 467- 
516 531 535 
602 623 644 
725-726 728 
830 852-853 
946 949 963 
017 1024 1027 
1200 1215- 
1309 1320 
1381 1413-
1491 1512 
1567 1590 
-1655 1662 
1681 1706 



fetal lung 



Clontech 



FLG004 



103 276 334 
1614 1658 



465-466 737 843 1131 



fetal liver- 
spleen 



Columbia 
University 



FLS001 



3-11 13 15 
51 54 56-58 
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 186 188 
200 202-206 
233-236 240 
255-256 258 
274 276-278 
293 295 299 
311 314 
332 342 344 
358 360 362 
386-387 390 
406 408 410 
437 439-442 
456 459 461 
487-488 490 
506 509-513 
529 531 534 
553-554 561 
576 579 581 



21 25 30- 
60-66 68 
85 87 89 
124 126- 
144 147- 
167-172 
190 193 
210-214 
244 246- 
261-265 
280-281 
-301 304 

318 320- 
-345 350 
370-374 
392-393 
-412 415 
444-445 
470 472- 
-491 493 
515-520 
536-540 
-562 564 
583 585 



39 41-48 50- 
-69 72 75 

92-103 105- 
127 130 133 
149 152-153 
174 176-178 
194 196 198- 
219 221-231 
247 250-251 
268-269 272 
284-286 288 
306-307 309 
321 326 329- 
352-353 356- 
376 378-384 
400-401 403 
417 419 422- 
448 452-454 
479 481-483 
500-501 503- 
522-524 526- 
542 547-549 
567-568 571- 
597 599-605 



fetal liver-
spleen 



Columbia 
University 



FLS002 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
805 808 810-812 814 818-819 821- 
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921-
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 








206 


212- 


215 219- 


-221 


223 


225- 


229 








231- 


232 


240-244 


246- 


247 


250- 


251 








258- 


259 


262 264 


268- 


269 


272 


275 








277 


280- 


281 284 


286 


288 


290- 


292 








295 


298- 


299 301 


-304 


305 


308- 


310 








318 


320- 


321 323 


325 


329 


331 


334 








342 


348- 


349 352 


-353 


356 


359 


368 








371 


374 


376-379 


381- 


384 


386- 


387 








392- 


393 


397-398 


400- 


401 


403 


410- 








413 


421 


423 426 


-427 


429- 


-430 


433- 








436 


438 


440 443 


445 


448 


451-452 








454- 


-455


460-463 


465- 


467 


469 


471- 








473 


475- 


-476 478 


-479 


481-483 


487 








490- 


-491


493-494 


497 


500- 


-501 


503- 








505 


509- 


-513 515 


-517 


519- 


-520 


524 








526 


-531 


534 537 


-542 


544 


547 


552- 








554 


556 


558 561 


-562 


564 


-567 


571- 








577 


583- 


-587 590 


-591 


593 


595 


597 








601 


604- 


-606 608 


-613 


616 


-617 


619- 








624


626 


-632 634 


637- 


642 


644 


647 








649-652 


654-659 


662- 


665 


669 


-672 








674-


-675 


681-682


685 


688 


690 


696 








698 


700 


-703 707 


709- 


-710


713 


717 








719-721 


723-724 


728 


731 


-732 


734 








737 


-738 


742-745 


748 


752 


754 


759 








763 


-766 


768 770


773- 


-777 


780 


782 








784 


786 


791 795 


-798 


801 


-802 


805 








808 


811 


-812 818 


823- 


-824 


826 


-827 








832 


834 


-837 839


843 


846 


848 


-856 








858 


-861 


865 867 


869 


871 


873 


-874 








876 


878 


881-882


887 


889 


892 


894- 








898 


901 


-902 904 


906-908 


913 


-915 








919 


921 


-924 926 


-932 


934 


-935 


937 








939 


-941 


943 946 


-947 


950 


953 


958 








961 


965 


-967 971 


973 


-975 


977 


-979 








981 


984 


-985 990


992- 


-993 


995 


-997 








999 


1001 1004-1007 1009 


-1011 








1013 1016


1020 


1023 


1025 1027-








1031 1033- 


1035 


1039 


-1042 1044- 








1045 1049 


1053 


1055 


-1056 1058- 








1059 1062 


1064- 


1065 


1067-1070








1072-1074 


1079 


1082 


1087 1089








1093 1097 


1099- 


1103 


1105-1107








1109-1114 


1123 


1125 


-1127 1132-








1134 1140 


1143- 


1145 


1148-1150








1156 1158 


1160 


1163 


1172-1173 








1177-1178 


1181- 


1184 


1190-1192 








1195-1197 


1199 


1204 


1206 1208 








1211 1214


1216 


1219 


1227 1230 








1234-1235 


1237 


1240 


-1241 1243








1245 1247 


1256 


1258 


1260-1261 








1264 1268 


1270- 


1271 


1275 1278- 








1279 1284-1286 


1288 


-1289 1299-








1301 1306 


1308 


1312 


1314 1317- 








1319 1323-1325 


1327 


-1330 1334-








1335 1339


1343- 


1347 


1349-1350 








1354-1355 


1357 


1360 


1362-1363 1 








1365-1367


1369 


1372 


1376 1378- 








1380 1386


1389- 


1391 


1394 1400








1403 1406 


1409 


1416 


-1419 1422- 








1427 1429 


1435 


1437 


-1438 1440-








1442 1446 


1448- 


1450 


1453 1460- 








1461 1468 


1470 


1472 


1474-1475 








1478 1482


1486 


1490 


-1493 1496 








1498 1500-1504


1506 


1508-1509 








1511-1512 


1516 


1518 


-1519 1521 








1524-1528 


1531 


1536 


-1538 1543








1547 1550 


1554 


1556 


1564 1567- 








1569 1580 


1587- 


1588 


1591-1592 
Tissue Origin



fetal liver-spleen



fetal liver 



RNA Source 



Columbia 
University 



Hyseq 
Library Name 



FLS003 



SEQ ID NOS: 



1597-1598 1600-1601 1611-1612
1618-1628 1630-1631 1635-1638
1641 1646-1649 1652 1654-1659
1661-1662 1664 1667-1669 1674
1676-1679 1683-1684 1686 1688
1691-1692 1699 1702 1707 1711
1713-1714 1717 1719 1722 1726-
1727 1730-1733 1738 1740 1743-
1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772 1773
1776 1779 1783-1786



103 300 318 321 352 372 379 381
384 392-393 403 422 424 429 434-
435 440 444 453 503 515 544 592
978 1064 1324-1325 1327 1333
1357 1369 1378 1418 1424 1622
1646 1649 1680-1681 1689-1690
1717 1743-1744 1769



Invitrogen 



FLV001 



fetal liver 
fetal liver 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 165 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



Clontech 
Clontech 



FLV002 



676 998 1719 



FLV004 



fetal muscle 



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 8$ 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
376 379 383 398 412-413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 



1099 1102 1116-1117 1121 1164
1173 1198 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
1557 1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



FSK001



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 



1626 1632 1634 1636 1641 1643-
1644 1646 1654-1657 1660-1662
1665 1668 1675 1685 1687-1689
1702-1703 1709-1710 1716 1719
1724 1727 1731-1732 1737-1740
1742 1747 1749 1755 1760-1761
1765 1772 1776-1777 1779-1780
1786



fetal skin 



Invitrogen 



FSK002



13 286 302 307 313 321 330 335
339 341 354 370 372 385 400 402
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939
1076 1109 1155 1317-1320 1326
1333-1335 1343 1347 1350 1369-
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687-
1688 1693 1718 1721 1725 1731-
1732 1739 1755



BioChain 



110 137 211 353 589 927 1108 
1639 1771 



fetal spleen 



FSP001 



umbilical cord 



BioChain 



FUC001



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1409 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 



GIBCO 



HFB001 








72 75 77 80 82


85 90


-91 


94 100- 








102 


107 


110 112 


-116 


118- 


119 


122- 








123 


126 


128 134 


136- 


140 


147- 


148 








153- 


155 


157 161 


165 


169- 


172 


175 








181 


186 


188-189 


197- 


198 


204- 


206 








208 


210 


215 222 


-223 


225- 


226 


230 








235- 


238 


240-241 


247 


253 


256- 


-258 








260- 


262 


267-269 


276 


279- 


281 


284








286 


289 


298 300 


-302 


307 


310 


318 








321- 


323 


325 330 


-331 


339 


341 


346- 








349 


352 


354 356 


-359 


362 


364-365 








371-372 


377 379 


-380 


382 


384 


387 








390 


400 


408 414 


-416 


419 


424 


431 








434- 


435 


438 441 


-443 


449 


451 


453- 








455 


457- 


-463 470


472- 


473 


475 


477- 








478 


482- 


-483 486 


-488


490- 


491 


493 








496 


499- 


-500 502 


-504 


506- 


507 


509- 








512 


516 


519-520 


522 


525- 


526 


529- 








530 


537- 


-540 543 


-544 


546- 


547 


566- 








567 


569- 


-570 572 


-582 


585 


588 


590- 








591 


593 


595 599 


601 


604 


606 


-609 








611- 


612 


614-620 


622-


624 


630 


632 








636 


643 


645-647 


650- 


-652 


654 


659 








661 


665 


667-668 


670-


-672


676 


678 








681 


687 


689 692 


-694 


697 


699 


710 








714 


717 


721 727 


729-


-732 


734 


736 








738


743 


-746 750 


-751 


759 


763 


766 








770 


772 


775-777 


784 


789 


791 


796 








799 


802 


-805 810 


-811 


814 


819 


-821 








824 


826 


830 834 


-837 


839- 


850 


854- 








856 


858 


-860 862 


864 


869 


871 


876- 








877 


879 


883 886 


-887 


890- 


891 


893- 








895 


898 


-901 905 


908- 


-910 


912 


-916 








919 


922 


-923 925 


927 


930- 


933 


935- 








938 


948 


952-960 


963 


-964 


967 


969- 








972 


975 


978-979 


981 


983 


986 


-987 








990 


992 


995 997 


999 


-1002 


1005- 








1009 1011-1013 1016 1018-1019
1023 1026 1029-1031 1033-1035
1039 1041 1047 1050 1053 1057
1059 1064 1068 1070 1072-1073
1078-1079 1081-1082 1086 1089
1094 1097 1103 1107-1109 1113-
1115 1121-1122 1127 1134-1135
1138 1140 1143 1148-1151 1153
1156-1157 1159 1167 1170 1175
1193-1194 1200 1202 1207-1209
1211 1216 1219-1220 1226-1227
1229 1232-1234 1240-1241 1243
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282
1285-1289 1293-1294 1305 1307-
1308 1312 1316 1320 1327 1338-
1339 1341-1344 1346 1349 1355-
1357 1359 1365-1366 1369-1370
1373-1375 1379 1386 1389 1394
1398 1409 1413-1414 1416-1417
1420-1421 1425-1427 1430 1433
1437 1439 1442 1445-1452 1454-
1457 1459 1463-1464 1468 1470
1474 1477-1479 1489 1492 1494
1497-1498 1501-1503 1507 1509
1511-1513 1517 1520-1521 1524-
1526 1531-1533 1535 1537-1538
1547 1554 1556-1559 1564-1567
1571 1584 1587 1589 1594 1599-
1601 1611-1612 1614-1616 1619-
1620 1625-1628 1630-1631 1634
1637-1638 1640-1643 1645 1648-




















1649 1651 1653-1655 1657-1658
1664-1665 1667 1669 1673 1678-
1679 1683-1684 1686 1693 1701
1704-1705 1709 1713-1714 1717-
1720 1724 1727-1728 1731-1733
1737-1738 1743-1744 1752 1754-
1755 1757 1760-1761 1765 1772
1779 1785










macrophage 


Invitrogen 


HMP001 


5-8 110 204-205 503 634 678 859
878 933 988-989 1379 14




infant brain


Columbia University


IB2002


10 12-13 15-18 22-23 25 29 34
37-39 43 47 50-51 54-56 58 60-63
65-66 68-69 72-74 80 82-83 86
88-92 97 100 102-104 105-108 110








112 


-113 115 


-116 


118 


123 


128 130 








134-136 138 


-139 


143 


147 


-149 151- 








152 


154-155 


163 


165 


-167 


169 172- 








175 


181-184 


186 


193 


-196 


198 201 








203- 


-205 209 


-210 


214 


-215 


222 224- 








226 


231-232 


235 


-236 


239 


246-247 








252 


257 260 


268 


-269 


272 


276-277 








279- 


-281 286 


288 


291 


-292 


295 298 








300- 


-301 304 


307 


310 


313 


321-323 








330- 


-331 333


-334 


339 


346- 


-347 349 








352 


356-357 


362 


371 


-372 


377 379- 








380 


383-384 


392 


397 


401 


406 408 








411 


413-414 


416 


418 


-419 


422 428 








430- 


431 434 


-435 


438 


443 


449 453- 








454 


461 464 


-466 


469 


-470 


472-473 








475- 


-476 478


482 


-483 


487 


490 492 








494


497 503 


507 


-508 


510- 


513 516 








519- 


520 524- 


-526 


530 


-534 


536-540 








547 


550-551 


561 


563 


-564 


566-567 








572- 


576 579 


581 


-582 


584- 


587 590- 








591 


593 595-597 


607-609 


611-613 








616- 


617 620 


622- 


-624 


627 


631 637 








641 


645-647 


650- 


-655 


657- 


658 660- 








665 


667-675 


689 


691 


695 


697 699 








703


707 713-


-715


717 


721 


728-731 








733- 


736 739 


743 


745 


751 










763 


769-770 


772 


778 


780- 


781 785 








788- 


789 793- 


794 


799 


803 


□AO Oil 

oOB oil 








814 


825-826 


830 


834-836 


840-843








845 


848-850 


854- 


855 


860 


862 864- 








865 


870 872 


875- 


876 


878 


p. £ a a a 








890- 


891 894- 


896 


898 


903- 










917 


919 922- 


925 


927- 


928 










934- 


936 938 


941 


945- 


946 










953- 


954 959- 


962 


966- 


969 


0*7*7 Q70 








981 


986-990 


992 


997 


999- 


1000








1004-1006 1014 1016 1018-1019
1024-1025 1033 1036 1047 1051-
1052 1054-1055 1057-1059 1063-
1064 1068-1070 1073 1081-1082
1085 1089 1108-1113 1118-1120
1123-1124 1130 1132-1138 1140
1149 1151 1153-1154 1163-1170
1172 1174-1175 1183-1184 1188
1190 1193-1194 1196-1197 1199
1204 1208-1209 1211 1218-1222
1226-1227 1229 1231 1234 1241
1247 1249 1251 1256 1258 1261-
1262 1269 1274 1279 1281 1283
1285 1287-1289 1294-1295 1305
1307 1313-1314 1316-1320 1329
1332 1341-1342 1345 1349 1356
1362-1363 1365-1366 1368-1370
1374 1381 1383-1384 1388 1400
1403 1406-1407 1413 1417 1420



1423 


1429 


-1431


1435


-1436


1439-


1441 


1443 


1447- 


-1449 


1451-


-1452


1454-1455 


1*3 / 


1459


1463


-1465


1 / CO 


linn. 


1 ATI 


1475 


1479 


1482- 




x*± a d 


1493 


-1494


1496


1498-


1499 


1502- 


-1503 


1505- 


-1507


1509 


1522- 


-1523 


1525 


1528 


1531-


-1533 


1542 


1546- 


-1547 


1549- 


-1550


1554- 


1555 


1563 


1565- 


-1567 


1569 


1575 


1580 


1583- 


-1586 


1588 


1590 


1592- 


1593 


1595 


1598 


1600- 


-1601 


1608- 


1610 


1612 


1614- 


-1616 


1619 


1621 


1624 


1626- 


-1627 


1630- 


-1633 


1637 


1639- 


-1640 


1642 


1644 


1647 


1652 


1654- 


1655 


1658- 


-1659 


1664- 


1665 


1672- 


1673 


1676- 


-1681 


1685- 


1688 


1693- 


1695 


1701- 


-1702 


1704 


1708 


1717- 


1720 


1723- 


1724 


1726- 


1728 


1733 


1735- 


1741 


1743- 


1744 


1752 


1755- 


1758 


1762 


1765 


1771 


1774 


1777- 


1778 


1786 









infant brain 



Columbia 
University 



IB2003 



infant brain 



17-18 20-23 29 34 43 60 68-69 
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 689 715 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1536 1546 1557- 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



Columbia 
University 



IBM002 



infant brain



101 113 139 152 260 279 290-292 
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 

1779 

10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669 



Columbia 
University 



IBS001 
Tissue Origin



lung, 
fibroblast 



lung tumor 



RNA Source 



Stratagene



Invitrogen 



Hyseq 
Library Name 



LFB001 



LGT002



SEQ ID NOS: 



671-672 819 949 966 1113 1130 
1151 1188 1193-1194 1196 1229 
1258 1265 1271 1287 1317-1319
1324-1325 1342 1423 1440-1441 
1448 1471 1482 1525 1532 1546 
1562 1569 1588 1591 1610 1618 
1647 1649 1658



5-9 17 20-21 25 68-69 82 94 105
153 157 197-198 203 207-208 212- 
213 223 262 266 283 302 321 326 
333 356 370 427 430 436 446 462 
472 493 498 503 516 519 527 535 
537-540 542-544 562 565 567 586 
599-600 607 615 630 647 662-664 
692-694 712 719 745 748 775-777 
794-796 810 837 843-847 849 854- 
856 869 876 903 934 953 955-956 
964 975-976 984 1000 1005-1007 
1024-1025 1033 1039 1053 1064 
1070 1072 1082 1112-1113 1134 
1136-1138 1140 1195 1223 1232- 
1233 1246 1279 1285 1295 1311 
1320 1334-1335 1343 1427-1428 
1446 1478 1482 1493 1504 1537 
1552 1555 1567 1575 1582 1598 
1620 1625 1632 1638 1645 1654- 
1655 1662 1680-1681 1684 1686 
1690 1696 1702 1711 1733 1741 
1760-1761 1778 1785 



5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126 
130-132 135-137 139-141 143-144 
147-148 151-153 155-156 159 161 
164 169 171 179-180 185 190 192 
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409 
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576 
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787- 
789 791 800 802-803 809-812 814 
824 826 828-829 832 838-839 841- 
845 849-850 852-855 857-861 864 
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956 
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 










1045 


1047 


-1050 


1052 


1054 


-1055 








1059 


1063 


-1064 


1067 


-1071


1073-








1074 


1078 


1085 


1087 


1089 


1095- 








1097 


1104 


1106 


-1107 


1109 


1112 








1116 


-1117 


1119 


1126 


1134 


-1135 








1139 


1141 


-1142 


1144 


-1145 


1148 








1152 


-1153 


1156 


-1158 


1167 


1170 








1172 


1178 


1195- 


-1196 


1198 


-1200 








1202 


1204 


1208 


1214 


1216 


1219 








1222 


1227 


1234 


1241 


1247 


1252








1257 


-1258 


1265 


1267 


-1270


1276 








1278 


1280


-1281 


1283 


1285 


1288- 








1289 


1295 


1300


1305 


1308 


1312 








1317 


-1321 


1329 


1338 


-1339 


1341 








1344 


-1346 


1349-


-1351 


1353 


-1355 








1357 


1365


-1366 


1369 


1378 










1383 


-1385


1394


1397 


1400 


1402-








1403 


1408




1419 


1423 


-1426 








1431 


1433 


-1436


1438 


1444 


1446-








1448 


1454


-1455 


1460 


1466 


1468 








1470 


1474


1 / on 


1481 


1483 


1486- 








1488 


1490


- lli J 1 


1494 


-1496 


1506 








1508 


- lo u y 


1 CI v 
1511-


1512 


1515 


-1516 








1 CI Q 


i C51 




1528 


-1529 


1536- 








1 JtU 


X 5% O 


Ids y - 


1550 


1555 


1560- 








1561 


1565 


1567 


1569 


1575 


1588 








1591 


1593 


-1594 


1596 


-1598 


1600- 








1602 


1608 


1614- 


1616 


1618 


1620 








1624 


-1625 


1627- 


1632 


1636 


1639 








1644 


-1645 


1647- 


1649 


1652 


-1653 








1656 


-1662 


1664 


1666 


-1667 


1670- 








1671 


1673 


-1675 


1678 


-1679 


1683 








1685- 


-1688 


1690- 


1692 


1696- 


-1699 








1705 


1709 


1716- 


1717 


1722 


1727 








1730 


1735 


1739 


1741 


1743- 


-1744 








1748-1749 


1753 


1760- 


-1762 


1765 








1767 


1770-1771 


1773 


1775-1776 








1778- 


-1779 


1786 








lymphocytes 


ATCC 


LPC001 


4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110
126 137 152-153 157 165 172 188-
189 197 203 210 217-218 222-223
225-226 229 231 247 251 256 264
272 280-281 284 300-301 321 325-
326 339 348 352 357 371 382 384
390 400 404 412 414 421 423 426-
427 430-431 445 447-448 451 454-
455 475 503 516 526-527 530 537-
540 549 556-560 563 574 577 589
602 613 615-617 621 623 628-630
636-637 647 649 657-659 690 697
717 723 755 764 775-777 780 786
789-790 793 800 802 822 838 849
866 869 876 881-883 892 898 906-
907 911 921-923 928 975 990 992
996 1001 1004-1007 1033 1050
1054 1078 1107 1135 1140-1141
1143 1148 1158 1163 1177 1199
1205 1216 1226 1231 1236 1241
1244 1250 1258 1260 1265 1269-
1271 1290-1293 1308 1312 1317
1319-1320 1339 1345-1346 1348
1350-1351 1357 1367 1369 1379
1381 1383-1384 1386-1387 1389
1394 1397 1405 1423 1425-1428
1431 1437 1446 1448 1461 1466
1470 1472 1474 1482 1492 1506
1528 1537 1546 1549 1591 1598
1600 1603-1604 1606 1627 1636



1638 1647-1649 1651 1658-1659
1664 1676-1677 1680 1681 1687-
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752
1756 1758 1777 1779



leukocyte 



GIBCO 



LUC001 



3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155
157 162-163 167 169-172 174 177-
179 186 190 192-199 203-207 210
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325-
330 333-334 340-342 348-349 352
354-358 370-371 380 385 387-388
400 405 408-410 412 414-416 421-
425 430-431 434-435 437 439 441-
442 445-451 453-454 456 459 461-
464 468-472 474-479 481 483-485
487-491 496 499-501 503-504 509-
513 516-519 522 526-527 529-531
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604
606-607 611-613 615-621 623 627-
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675
678 682-684 692-696 698 700 706
708 710 716-720 725-726 729-736
738-739 743-746 749 751 753 756
759 765-766 768 770-778 780 784-
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828-
830 832 834-836 838 843 845-860
863-864 866-871 877 879 881-892
894-896 898 902 904-914 916 919-
925 927 930-932 935 936 941-942
945 948-949 953 955 956 958 960-
962 964 967 970-971 973 975 977
985-990 992-993 995-996 999-1002
1004-1009 1011 1014 1017-1019
1022-1023 1025 1027 1029-1031
1033-1036 1038 1041 1043 1047
1050 1053-1054 1058-1059 1061-
1062 1064 1068 1070 1072 1078
1085-1086 1089-1091 1093 1097
1106-1107 1110-1113 1115-1117
1122-1123 1125 1129 1132-1133
1135-1137 1140-1145 1152 1158
1163 1160 1170-1174 1176-1178
1180 1182-1183 1186 1195 1198-
1200 1202 1205-1206 1211 1216
1219-1221 1223-1227 1230-1236
1238-1242 1247 1252 1254 1256
1258 1261-1262 1264-1265 1269-
1270 1272-1275 1277 1280-1284
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324-
1330 1333-1335 1339 1341 1343-
1347 1349 1353-1357 1359-1361
1365-1367 1369-1370 1373-1374
1377 1379-1381 1386-1387 1394
1400 1403 1409 1419 1423 1425-
1428 1430-1431 1433-1434 1437-
1438 1440-1442 1446 1448 1450



1453 


1458- 


1459 


1463- 


1464 


1468 


1470- 


1471 


1474 


1477- 


1478 


1482- 


1488 


1490- 


1493 


1496- 


1501 


1504 


1506 


1509 


1512- 


-1513 


1516 


1519 


1521- 


1522 


1524- 


1525 


1527- 


-1528 


1531 


1534 


1538 


1541 


1545- 


-1547 


1549- 


-1550 


1553 


1555- 


1556 


1560 


1565 


1567 


1575 


1580 


1589 


1591 


1594 


1596 


1598 


1600- 


-1602 


1606- 


1608 


1611 


1614 


1620- 


-1621 


1624 


1626- 


-1629 


1631- 


-1632 


1636 


1638- 


1639 


1641 


1644- 


-1645 


1648- 


-1650 


1653- 


-1655 


1658 


-1660 


1662 


1669- 


1670 


1675 


-1679 


1684- 


-1688 


1690- 


1692 


1696 


1700 


1702 


1707 


-1709 


1711 


1716 


-1717 


1720 


1723 


1725- 


1727 


1733 


1737 


-1738 


1741 


1743- 


1744 


1748 


-1749 


1752 


1755 


1760- 


1762 


1765 


1769 


1771 


-1772 


1781- 


1784 


1786 











4 35-36 44-45 61 68-69 75 82 102 
119 139 154 179 197 244 280-281 
324 372 404 430-431 455 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 679 698 764 
773 775-777 802 848 851 856-857 
879 905-907 915 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1621 
1628 1670 1676-1677 1691-1692 

1699 1733 1738 1772 

25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 546 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278'l280 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 

1761 

5-8 10 12 14-18 20-21 24-25 29 
33-39 42-43 52 55-58 60-64 68-69 
71 73-74 79-80 82 89 98 100 103 
106 108 112 123 128 133-137 144- 
146 148 150-152 154 158-159 165- 
166 170-172 174 176 178 181-185 
188-190 194-196 201-206 210 217- 
222 224 227-228 231 233-237 247 
251 253-254 256 261-263 266-267 
271 276-277 279-281 284-286 288 



leukocyte 



Clontech 



LUC003 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004 



mammary gland 



Invitrogen 



MKG001 



290 297 299 301 304 
320-321 323-325 327- 
334 339 341 344-345 
359-360 362-363 368 
383 388 390 393-395 
408 412 414-415 423 
441-444 448 451-455 
476 479 482 485-486 
495 498 503 506 509- 
519-520 522 527 529 
547 549 554 557 562 
589-591 597 602 607 
629 632 634-640 644 
652 655 657-658 660 
672 674-676 679 682 
706-707 710 713 717 
732-734 736 738 743 
755 759 761 766 770 
789 794 803 806-807 
822 827-829 837 842 
-864 866 869-870 872 
893-900 904 906-907 
921-923 926 935-937 
953-954 957 960-961 
970 977-978 984-989 
1000-1001 1005-1006 
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 1085 1087 
1095-1102 1107-1108 
1121-1123 1131-1133 
1139-1142 1144-1145 
1153 1159 1167 1170 
1183-1185 1190-1192 
1207-1208 1212 1216 
1223 1225 1231 1234 
1247 1253-1254 1258 
1262 1270-1280 1283 
1298 1307 1314 1316 
1325 1330 1334-1335 
1349-1352 1354-1355 
1370 1377 1379 1381 
1389 1405 1414 1419 
1425-1426 1428-1429 
1437 1439 1448-1449 
1460-1464 1466 1471 
1487 1489-1491 1493 
1512 1519 1526-1528 
1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1582 
1592 1594 1596-1597 
1607-1608 1610 1612 
1621-1622 1625-1626 
1636 1641 1643-1644 
1652 1654-1655 1657 
1662 1664-1666 1669 
1674 1676-1677 1680 
1692 1701 1706 1713 
1720 1723-1728 1730 
1740 1742-1744 1746 
1751 1753 1760-1762 
1771 1774 1776-1777 
1784 1786 



309-312 318 
329 331-332 
348 350 356 
371 376 379- 
397-398 405 
430 434-437 
462-464 474 
488 490 494- 
-512 516-517 
534 537-541 
572-574 587 
618 623 628- 
647-648 650- 
665 667 669- 
688 695-696 
720 722-730 
747-748 750 
780 784 786- 
809 814 817- 
854-858 863- 
878 881 889 
911 916 919 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-1137 
1148-1149 
1172-1173 
1196-1199 
-1218 1222- 

1240-1241 
-1259 1261- 

1285-1286 
-1320 1323- 
1342-1345 
1359 1369- 
1383-1384 
1421-1423 
1431 1434- 
1454 1457 
1480-1483 
1505 1507 
1532 1534 
1549-1550 
1567 1572 
1587-1588 
1601-1602 
-1616 1618 
1631 1635- 
1647 1650 
-1658 1660 
-1671 1673- 
-1685 1689- 
-1715 1719- 
-1732 1738 
-1747 1749 
1765-1768 
1779 1783- 



induced neuron 
cells 



Stratagene 



NTD001 



29 35-36 80 116 123 

214 230 280-281 284 

330 340 358 371 375 

422 424 492 497 532 



156 163 181 

285 307 321 

377 380 382 

-533 542 546 



549 566 586 595 612 
734 775-778 780 792 
856 858 875 936 953 
1041-1043 1055 1072 
1194 1206 1223 1246 
1288-1289 1291 1294 
1349 1359 1412 1423 
1623 1645 1684 1705 



645-647 654 
799 821 826 
985 990 992 
1104 1193- 
1253 1274 
1311 1320 
1485 1620 
1715 1751 



retinoic acid 

induced 
neuronal cells 



Stratagene 



NTR001 



neuronal cells 



5-8 78 268-269 277 383 431 506 
623 677 731 999-1000 1199 1425- 
1426 1547 



Stratagene 



NTU001 



29 65-66 80 82 110 119 146 152 
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004 



311 314 379 408 419 430 454 1055 

1095-1096 1272-1273 1312 1320 

1378 1652 1671 1720 1725 1736 
1741 1755 



placenta 



Clontech 



PLA003 



5-8 124 208 277 370 843 906-907 

1280 1317-1319 1369 1609 1621 
1737 



prostate 



Clontech 



PRT001 



9 46 57 71 107 147 171 177 197 
201 225 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



rectum 



Invitrogen 



REC001 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 










1426 1436 1439 1469 1474 1477 








1482 1546 1587-1588 1592 1596 








1610 1622 1627 1644 1658 1662 








1665-1666 1669 1675-1677 1749 








1766 


salivary gland 


Clontech 


SAL001 


10 55 97 103 110 140 149 152 158 






198 217-218 242-243 256 301 308 








312 321 333 351 354 360 410 437 








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








809 826 848 865 879 906-907 925 








933 963 1016 1020 1025 1040 1046 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 1288-1289 1298 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1424 1447 1449 








1474 1482 1492 1494 1498 1511 








1523-1524 1537 1554 1596 1626- 








1627 1636 1652-1655 1658 1665 








1671-1672 1691-1692 


salivary gland 


Clontech 


SALs03 


158 326 1423 1463-1464 


skin 


ATCC 


SFB001 


1320 1400 


fibroblast 








skin 


ATCC 


SFB002 


262 736 1025 1253 


fibroblast 








skin 


ATCC 


SFB003 


709 1119 1350 1631 1653 


fibroblast 








small 


Clontech 


SIN001 


25 142 146-147 151 155 198 203 


intestine 






244 260 271 280-281 286 288 298 








301-302 308 312 334 340 371 398 








408 412 414 416 423 426-427 430 








434-435 445 452 454 478 503 516 








519 521 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904 








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055 








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 








1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597 








1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692 








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal 


Clontech 


SKM001 


18 20-21 82 84 101 118 134 148 


muscle 






151 153 166 225-226 258 274 277 








289 329 361 412 414 424 440 452 








459 470 488 500-504 537-540 547 








660 673-675 715 773 780 786 830 








one O'n ocn qci qrq oon ooo inort 
yUb ybu 7oJ 7D£ jjU y y c, JLUzU 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


235-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 9 11 17 30-31 35-36 43 46 60 



82 85 92 94 108 110 
167 198 204-205 210 
259 277 280-281 300 
317 372 379 387 392 
430 433 448 467 473 
509 513 519 524 526 
547 549 551 559 567 
607 616-617 623 625 
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777 
809 820 832 834-836 
855 858 861 864 871- 
898 906-908 917 919 
944 970 985 990 992 
1039 1053 1059 1065 
1077 1082 1085 1097 
1116-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350 
1356 1359 1368 1375 
1407 1423 1429 1437 
1454 1470 1482 1492 
1511 1529 1538 1548- 
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



116 139 157 
215 229 256 
302 304 315 
419 426-427 
487 489 506 
537-540 543 
569-570 593 
637 649-650 
673 679 681- 
726-729 734 
781 789 791 
847-849 854- 
872 875 884 
924 934 942 
993 998 1013 
1072 1075 
1103 1109 
1151 1170 
1225 1241 
1312 1320 
1353-1354 
1400 1406- 
1443 1448 
1501 1508 
1549 1565 
1614 1625 
1651-1652 
1751 1755 



adult spleen 



Clontech 



SPLc01 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



stomach 



Clontech 



STO001 



thalamus 



Clontech 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



THA002 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753 



thymus 



Clontech 



THM001 



44-45 54 57-58 62-64 79 104 123 

126 134 153 193 212-213 218 242- 

243 258 274 277 279 297 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487 

493 503 506 509 517 526 535 537- 



540 546 548 554 567 
591 604 612 621 638 
649 656 660 665 670 
728 735 739 746 759 
775-777 780 784-785 
824 826 828 845 851 
866 870-871 878 884 
900 927 930-931 967 
992 999 1014 1029-1 
1066 1073 1103 1107 
1117 1119 1140-1142 
1172 1177 1195 1206 
1216 1218-1219 1221 
1271 1277 1282 1320 
1367 1369 1383-1384 
1423 1425-1427 1448 
1493 1536 1554 1620 
1649 1654-1655 1661 
1670 1674 1676-1677 
1707 1711 1731-1732 



990 
1059 



584 586 590- 
-640 645-647 
698 710 720 
762 766-767 
800 802 809 
858-859 864 
887 892 899- 
963 986 
030 1033 
1113 1116 
1158 1163 
1209 1213 
1222 1227 
1329 1349 
1417 1419 
1477 1488 
1644 1646 
-1662 1669 
1685-1688 
1737 



thymus 



Clontech 



THMc02 



5-9 15-21 25 33 35-36 43-45 48 
50-51 54-55 60 75 83 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786 






WO 01/53312 



PCT/US00/34263 



Tissue Origin 
thyroid gland 



RNA Source 
Clontech 



Hyseq 
Library Name 



SEQ ID NOS: 



THR001 



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 168-169 171 
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281 
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594- 
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880- 
881 887-888 890-892 894-895 898 
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978- 
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436- 
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



trachea 



Clontech 



TRC031 



9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331 




















352 


372 31 


7 384 


414 


424 


445-446 








454 


472 474 491 


496 


560 


579 588 








593 


597 607 612 


626 


681 


702 719 








610 


859 866 878 


894- 


895 


912 916 








922 


932 935 1046 1075 1080 1099- 








1102 


1113 


1208 


1215 


1232 


-1233 








1237 


1281 


1312 


1385 


1387 


1405 








1414 


1424 


1430 


1437 


1447 


1505 








1569 


1579 


1586 


1600 


1641 


1653 








1667 


1671 


1676- 


1677 


1683 


1691- 








1692 


1711 


1717 


1726 


1772 




uterus 


Clontech 


UTR001 


17 19 25 41 46 


57-58 61 


89 104 








108 


139 152 174 


198 


200- 


201 206 








263- 


265 274 290 


387 


408 


420 438 








446 


448 452 473 


491 


493 


499 503 








506 


513 519 522 


526 


530 


542-543 








560 


601 610 632 


659 


665 


720 751 








773 


780 833 845 


857 


872 


877 912 








929 


934 937 996 


1009-1011 1018 








1050 


1075 


1107 


1124 


1170 


1219 








1258 


1279 


1287 


1310 


1320 


1323 








1343 


-1344 


1375 


1437 


1451 


-1452 








1478 


1481 


1498 


1519 


1521 


1536 








1552 


1579 


1597 


1602 


1606 


1620 








1626 


-1627 


1649 


1652 


1661 


1670 








1719 


1722 


-1723 









SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 
IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PRO1114 protein 
sequence. 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein 
PRO943. 


2389 


99 


3 


AF113136 


Homo sapiens 


IL-1 receptor-associated- 
kinase-M; IRAK-M 


3043 


100 




\j x /duo 






6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535 


98 


6 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


9 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


1* 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1 


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein 
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform 


6377 


100 


19 


U00059 


Saccharomyces 
cerevisiae 


Yhr121wp 


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 






— — 3 

Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


"X 0 ft 




23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O- 
sulfotransferase 


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase 
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 




U43701 


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77. 


1083 


99 


in 




Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


/ID 


an 


j j. 


W71 *7AQ 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


fill ' 






a t?o "lion 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2 


loll 


100 


33 


229481 




dioxygenase 


1507 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antifcuai-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


4726 


99 
TABLE 2 


39 


I / 0 / 33 


Homo sapiens 


Hnm-qn ant i 711a i - 2 (AZ-2) amino 
acid sequence. 


3556 


77 


40 


VJ J J 1Z J. 




M-phase phosphoprotein-1 


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 


795 


100 


42 




-— , 

Homo sapiens 




1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 


384 


94 


44 


TIT QCl *7 


Mus musculus 


Elf-1 


2724 


88 


45 


U19617 


Mus musculus 


Elf-1 


20^2 


86 


46 


AJ? 1ULJ /3D 


Homo sapiens 


nai-pni nHiir^i VP f BftTn* OTP 

ostcoinauciivc \j ±. i. 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO: 24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


"942 


99 


51 


X63547 


Homo sapiens 


oncogene 


j at o 




52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783 


Mus musculus 


uridine kinase 


917 


7"1 


54 


X83973 


Homo sapiens 


transcription factor 


A A Q C 


Q Q 
J O 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24. 


1491 


100 


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript. 


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


Aril A 




60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15. 


cm 
bill 


i nn 


61 


AB031069 


Homo sapiens 


protein containing caal. 
domain 1 




i nn 

LUU 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein 

PROVoJ . 


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein 

PKO/oJ . 


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 






67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


Z82059 


Caenorhabditis 
elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


* 17*5*5 >I*)*)Q 


Homo sapiens 




1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


I *i 




sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1 


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950 


100 


77 


X99302 


Homo sapiens 


Pop1 


655 


100 


78 


AL136538 


Schizosaccharomyces 
pombe 


similarity to S. cerevisiae 
kti12 protein 


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 


80 


/VJ-lU 3D / OD 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


2033 


ivu 


81 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


1220 


96 


82 


X57351 


Homo sapiens 


1-8D 


677 


98 


83 


AC005 c i94 


Homo sapiens 


R26984 1 


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5959 


99 


85 


rlT \J ^ t ^ J U 


Homo sapiens 


CLIC4 


1305 


99 


86 


AB018423 


Mus musculus 


SH2 domain-containing protein 


1360 


78 


87 




Homo sapiens 






Q O 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre- 
mRNA splicing 
caccor—gene^iu : i*ikc i / . z 


634 


3 6 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


Ad 28d4 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO:86. 


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 




a i?") mii 
Ar 2 ^ / / 4 1 


Rattus 
norvegicus 


protein kinase WNK1 




94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


QR 


AT mi 7 c c 


Homo sapiens 


ciLivu / z x\i , o iMnesin reiacea 


3423 


100 


99 


t\\- UUj/tlJ 






J. 3 / % 


QQ 


100 


Y95293 


Homo sapiens 


Human GEP containing NEK- like 


4092 


99 


101 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


atji n f\1 C 1 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042 


96 


104 


AROi 59R? 
now ^37d« 




«a a v ^ "n ^ / t*V» y-»/-iyi {na if i na Qp 
DCL lUC/ L11ICU111UC A- XI luat. 


A 7 1 fl 


i nn 

1UU 


105 


API 510 74 


J-7 r~\mr^ caf*^ pne 




831 


CA 


106 


M35522 


Canis 


GTP-binding protein (rab7) 


354 


50 


107 


R99800 


Homo sapiens 


NTII-1 nerve protein, 

f- j» /— -i 1 i i- > hoc rpopnprfl t~ i on rrf 

nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


isoform 


1290 


93 


109 


AC005^14 


Homo sapiens 


F23269 2 


3369 


99 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


X52425 


Homo sapiens 


interleukin 4 receptor 


4496 


100 


112 


Y41686 


Homo 
sapiens 


Human PRO274 protein 
sequence. 


2285 


100 


113 


W15S06 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117 


W30891 


Homo 


Human cytostatin III protein. 


715 


99 






sapiens 








118 


AF116618 


Homo sapiens 




1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120 


AF098070 


Drosophila 
melanogaster 


Lis1 homolog 


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


0 1 19 


XvU 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP). 


oJJ 


Q O 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


1 1 Z. 


z / 


126 


U75467 


Drosophila 
melanogaster 


Atu 






127 


Z68220 


Caenorhabdit 
is elegans 


Similarity to Human ADP/ATP 
carrier protein 


A 1 Q 




128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein. 


4 £3 


100 


130 


AF115391 


Lactobacillus 
sakei 


ribokinase RbsK 


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


916 


87 


133 


W52811 


Homo sapiens 


Human DBI/ACBP-like protein 
(DBIH). 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


3230 


100 


135 


M69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


ore 

Bib 


Q Q 


138 


AL033520 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


A *> A 


3 9 


139 


AF020261 


Santalum 
album 


proline rich protein 


117 




140 


X70394 


Homo sapiens 


zinc finger protein 


1634


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabditis elegans


predicted using Genefinder 


J OS 


A "5 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens


A human proliferation and
apoptosis related protein. 


a a 0 


100


146 


AB004906 


Ipomoea 
purpurea 


transposase 


" 1 AC 


on ' 


147 


AC007357 


Arabidopsis 
thaliana 


F3F15 . -l8 


O •* I 


31


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7 . 


785 


99 


151 


U10397 


Saccharomyces cerevisiae


Yhr148wp


515 


53 


152 


X73418 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382I10.5.1 (novel protein 


2034 


99 








similar to arginyl-tRNA) 






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455


99 


155 


X94703 


Homo sapiens 


rab2S 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471 


100 


158


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2) . 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71 . 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase


4402 


100 


170 


AE000871 


Methanobacterium
thermoautotrophicum


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel. 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein.


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3.


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related
molecule HCRM-2.


936 


99 


179 


AF220492 


Homo sapiens


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens


C1q B-chain precursor




100 


181 


U57344 


Mus musculus 


Meis3


lb 1 J 




183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389


58 


186 


AF2C0357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82


187


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 

HLDBG3 3 . 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2)
polypeptide.


2605 


99 




AF084259


Mus musculus


bromodomain-containing
protein BP75 


693 


54 




1 00 /3Z 


Rattus 
norvegicus 


serine dehydratase (AA 1-327)


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fh!70 7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 



W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236 1. 


1614 


100 


199 


144277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: £142. 


558 


99 


203 


X13885 


Nicotiana 
tabacum


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


694 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus


C1q C chain


970


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein)


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc polypeptide N-
acetylgalactosaminyltransferase


3331 


99


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccharomyces pombe


putative proline-trna 
synthetase 


811 


41 


222 


AL109736 


Schizosaccharomyces pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 






Homo sapiens 


dJ979N1.1 (dJ979N1.1)


5199 


98 




"turn T/> m 


Mus musculus


mmDj4 


1761 


92 


225 


AB032401 


Mus musculus


mmDj4 


1988 


92 


/.£ 1 


X83502 


Saccharomyces cerevisiae


J1007 


112 


26 


228 


X83502


Saccharomyces cerevisiae


J1007


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PR082 8. 


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1.


221B 


99 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide


1017 


100 








designated RAQ. 






235 


Z50749 


Homo sapiens 


yeast sds22 homolog 


1800 


100 


236 


Z50749




yeast sds22 homolog


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1


2137 


100 


238


AJ270205


Entodinium
caudatum


putative
phosphatidylinositol-4-
phosphate 5-kinase


114 


37


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 

region and ATP binding region


710 


93 


240 


W56538


Homo sapiens 


Human hedgehog interacting 

protein (HIP).


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3436 


99 


242 


AF155107


Homo sapiens 


NY-REN-37 antigen


996 


99 


243 


AF155107


Homo sapiens 


NY-REN-37 antigen


1005 


100 


244 


AL031320


Homo sapiens 


dJ20N2.1 (novel protein
similar to yeast and
bacterial cytosine
deaminase)


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P910B6) ) 


2391 


98 


247


U32274 


Saccharomyces cerevisiae


lurjoowp ; LAl : U . iz 


191 


■a -7 


248 


Y41719 


Homo 
sapiens 


Human PROS 64 protein 
sequence . 


1879


100 


249


AB029434


Homo sapiens 


ghrelin precursor 


611 


100


250 


A3 /O J 1 


Rattus
norvegicus 


carnitine/acylcarnitine
carrier protein 


246 


38


251 


W80993 


Homo 
sapiens 


Human RIP-interacting factor
RIF.


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49) . 


765 


100 


254 


AL354533


Leishmania
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1.




99 


257 


AL035539 


Arabidopsis 
thaliana


putative amino acid transport 
protein 


390 


27 


258 


W747o7 


Homo sapiens 


Human secreted protein 

sncoaea gene so cionc 

WHEWNC1 
nxi c rum ox. 


1171 


100 


259 


AL035689 


Homo sapiens 


dJ18 7Jll.l (novel protein 

Q"imi 1 Jir t*o n ir<~\ h ^ "i t> Vi nnq^ C 

inhibi tors ) 


974 


100 


260 


AE0OG909 


Methanobacterium
thermoautotrophicum


kinase related protein 


363 




261 


AL050131 


Homo sapiens 


hypothetical protein 


626 


100 


262 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


263 


AL035593 


Homo sapiens 


dJ310J6.1 (novel protein) 


821 


100 


264


AL022318 


Homo sapiens 


bK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1)


1072 


100 


265 


AF205940 


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein) 


789 


100 


267 


AL034548 


Homo sapiens 


dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


1888 


99 
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


268 


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 


AF161470 


Homo sapiens 


HSPC121 


1232 


96 


270 


X90763 


Homo 
sapiens 


HHa5 hair keratin type I 
intermediate filament 


2190 


99 


271 


AF207600


Homo sapiens 


ethanolamine kinase 


1952 


100 


272 


M32334 


Homo sapiens 


intercellular adhesion 
molecule 2 


1436 


100 


273 


AF161483 


Homo sapiens 


HSPC134 


663 


61 


274 


Y53C52 


Homo sapiens 


Human secreted protein clone 
df2 02_3 protein sequence SEQ 
ID NO: 110. 


587 


100 


276 


Y77576 


Homo sapiens 


Human cytoskeletal protein
(HCYT) (clone 2195418)


762 


100 


277 


AF077042 


Homo sapiens 


30S ribosomal protein S7 


1269 


100 


278


Y94907 




cal06 19x protein sequence 
SEQ ID NO: 20. 




y o 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99


280 


Z75134 


Canis familiaris


rod transducin 


1816 


100 


281 


Z75134 


Canis familiaris


rod transducin 


1718 


96 


282 


AF249873 


Homo sapiens 


muscle-specific protein


1395 


100 


283 


AL050007


Homo sapiens 


hypothetical protein 


405 


98 


284 


AF201931 


Homo sapiens 


DC1 


1859 


99 


285


AF156102 


Homo sapiens 


ELL complex EAP30 subunit


1318 


99 


286 


Y35897 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
146. 


1250 


99 


287 


U89964 


Homo sapiens 


HEM4S 


923 


100 


288 


AL050143 


Homo sapiens 


hypothetical protein 


598 


100 


289 


AJ011098 


Homo sapiens 


telethonin 


574 


100 


290 


Y66724 


Homo 
sapiens 


Membrane-bound protein 
PR083 6. 


2321 


100 


291 


AF034801 


Homo sapiens 


liprin-alpha4


2565 


98 


292 


AF034801 


Homo sapiens 


liprin-alpha4 


2590 


100 


293 


AL049851 


Homo sapiens 


dJ889J22B.1 (novel protein
(isoform 1))


1738 


100 


294 


Y73348 


Homo sapiens 


HTRM clone 839651 protein
sequence.


1245 


99 


295 


L11672 


Homo sapiens 


zinc finger protein 


1694 


44 


296 


AL035423 


Homo sapiens 


dJ20I3.1 (brain mitochondrial 
carrier protein-1 (BMCP1))


1024 


79 


297 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


298 


AF161417 


Homo sapiens 


HSPC299 


1147 


85 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


1236 


99 


300 


U26397


Rattus 
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


301 


AF036145 


Homo sapiens 


meningioma-expressed antigen
5 


3458 


100 


302 


Z82022 


Homo sapiens 


GlcNac-1-P transferase


2067 


99 


303 


AF269232 


Mus musculus


butyrophilin-like protein 
BUTR-1 


271 


50 


304 


AJ222644 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


659 


50 


305 


AF054180


Homo 
sapiens 


hematopoietic cell derived 
zinc finger protein 


351 


79 


306 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein


3056 


100 


308 


Y44486 


Homo 
sapiens 


Human GPRW receptor 
polypeptide . 


1721 


100 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu 


2598 


100 


310 


AF293335


Homo sapiens 


p30 DBC


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047


727 


100 


315 


AF208068


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100


316 


Y66€66 


Homo 


Membrane-bound protein

PRO103 3 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1.


1253 


98 


318 












319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 




Homo sapiens 


putative TH1 protein


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID NO: 66. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PR02 71, 


1976 


100 


325


Y94944


Homo sapiens 


Human secreted protein clone 
bfl57__l6 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


328 


Z78013 


Caenorhabditis elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 

sapiens] 

>R65207 

R65207 02- 

MAR- 1995 27- 

AUG-1993 

Human 

stromalin-1 . 

[Homo 

sapiens 


nuclear protein SA-1 


6492 


99 




AbUUobo J 


Homo sapiens 


au327oib.3 (supported by 
GENSCAN, FGENES and GENEWISE)


2133 


99 




v*a ci 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489.


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 




Moon 

nzy u do 


Eimeria
maxima 


em100 gene is homologous to the
Eimeria tenella gene et100


154 


26 




Y85564


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


337




Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


2602 


94 


338


Y85564 


Homo sapiens


(Hs-UNC-53/1) sequence.


3447




339 


Z66561 


Caenorhabditis elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439 


84 










VDJ region 






344 


U10281 


Sus scrofa 


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus
norvegicus 


calmodulin-binding protein 


2363 


91 


348 


AL049481 


Arabidopsis 
thaliana 


AIG1-like protein


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


99 


351 


AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302 


2623 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356 


AF151029 


Homo sapiens 


HSPC195 


941


Q1 

y x 


357 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


1911 


100 


358 


W78128 


Homo sapiens


cuLuueu Dy vjenc J ciunc 
HOSBI96 . 


-LJ. 1 l 


100 


359 


X03414 


Drosophila
melanogaster


Kr polypeptide 


316 


45 


360 


AF151079 


Homo sapiens 


HSPC245 


643 


100 


361 


Y53886 


Homo sapiens


A suppressor of cytokine 
signalling protein 
designated HSCOP-6, 


530 


41 


362 


AF254741 


Drosophila 
melanogaster 


Centaurin Gamma 1A 


681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U73200 


Mus musculus 


p116Rip


884 


82 


367 


AF263744 


Homo sapiens 


ERBIN 


4973


Q Q 


368 


U37501 


Mus musculus 


laminin alpha 5 chain


5 B67 


72 


369 


AF043S95 


Caenorhabditis elegans


similar to the protein 

nhosDhates P* - * famil v 

yiiuoyna^co i ami. j. y 


549 


36 


370 


Y73440 


Homo sapiens 


Human secreted protein clone 
yj23 1 protein sequence SEQ 
ID NO:102. 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927


100


373 


Y73345 


Homo sapiens 


HTRM clone 438283 protein 
sequence . 


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


2717 


98 


375


A95106 


unidentified 


RED ALPHA 


1202 


99 


376


W74828 


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQAB52 . 


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518 


382 


100 


380 


X66363 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PR0703 protein 
sequence . 


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7008 


98


383 


U64608 


Caenorhabditis elegans


coded for by C. elegans cDNA 
ykl73cl2.5 


24* 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


4123 


97 






WO 01/53312 



PCT/US00/34263 





387


AF208845 


Homo sapiens 


BM-003 


1375 


99 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700


inn 1 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


16U 


(i2 


395 


AF181721 


Homo sapiens


RU2S 




100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin
protein.


1626 


98 


397


U48238


Mus musculus


^.int. Liayei protein neuro-ui 


1AQ 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AC004B59 


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus


tenascin-X 


10246 1 


62 


403 


AL133288 


Homo sapiens


a.u o / lu i . j. \ similar to 
D. melanogaster CG5986 
protein/ 


761 


100 


404


Z68753


Caenorhabditis elegans




888 


48 


405


Z78013 


Caenorhabditis elegans


Cadherin-related tumor
suppressor 


soy 


o 3 


406


n Tin 


Homo sapiens 


protein containing CXXC 
domain 2


1196 


97 


407 


Mr ijjiub 


Homo sapiens 


NY-REN-36 antigen


1168 


100


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin


1 O 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 




r *■ xj\j2\. uioLcin roAiJ 


2072 


94 


412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


413 


AL03165B 


Homo sapiens


aujiuuu . / tnovei procein 
3) 


776 


98 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit 


2961




416 


U43503 


Saccharomyces cerevisiae


Lphlp 


115 


42 


417 


AL160493 


Leishmania 
major 


possible t26fl7.21 


239


J 3 [ 


418 


Y08100 


Homo sapiens 


Human PR0331 protein. 


330


59


419 


U15131 


Homo sapiens 


p!26 


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2 


1962 


100 


423 


AL137530 


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1084 


55 


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 


AE0036B3 


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430 


AF096897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division 
control protein 


2284 


100 


434 


AB006697 


Arabidopsis 
thaliana


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc: polypeptide N-
acetylgalactosaminyltransferase


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553).


3073 


99 


441 


X14971 


Mus musculus


alpha-adaptin (A) (AA 1-977) 


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-938)


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein
PR0113 6 . 


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


2662 


85 


447 


AF132484 


Mus musculus 


unknown 


478


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156 . 


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327


1606 


100 


450 


Z68753 


Caenorhabditis elegans


ZC518.3b 


951 


49 


451


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3 . 


155 


32 


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46JL0) . 


2799 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61_l protein sequence SEQ 
ID NO: 156. 


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone 
HSLFM29 . 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ 

ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


F25965 1 


1018 


100 


464 


AF064856 


Rattus sp.


7acomp protein + 


1845 


84 


465 


AF223408 


Homo sapiens 


B99 


3686 


99 


466 


AF223408 


Homo sapiens 


B99 


2878 


87 


467 


AF104415 


Mus musculus 


gene trap locus -13 


6336 


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 1 
JDP-1 


196 


49 


469 


AL031297 


Homo sapiens 


dJ9722 0.1 (novel gene) 


3564 


99 


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein 


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX39.


838 


100 


475 


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 


3411 


100 


476 


D38549 


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AL031534 


Schizosaccharomyces pombe


putative asparagine synthase 


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein 


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059


434 


77 


481 


AJ238248


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061 


Saccharomyces cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 


HSPC263 


1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527 


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487 


Y19062 


Homo sapiens 


39k3 protein 


2475 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence . 


555 


56 


489 


AL021918 


Homo 
sapiens 


b34I8.1 (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-938)


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


1459 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein 
with some similarity to 
Drosophila KRAKEN) 


513 


96 


495 


AF036977


Homo sapiens 


unknown 


1812 


100 


496 


U93564 


Homo sapiens 


p40


133 


45 


497 


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126 . 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein


653 


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle 
phosphoprotein CECYP-2. 


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus 

musculus 


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein 


669 


100 


504 


AF208861 


Homo sapiens 


BM-019 


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100 


507 


X66285


Mus musculus 


HC1 ORF 


115 


43 


508 


D00189 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


5227 1 


99 


509 


Y94971 


Homo sapiens 


Human secreted protein clone 
fal71_l protein sequence SEQ 
ID NO: 148.


2176 


100 


510 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase 


781 


77 


511 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase 


1347 


100 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyl transferase 


1520 


99 


513 


X84908


Homo sapiens 


phosphorylase kinase 


5729 


99 


514 


X52851


Homo sapiens 


peptidylprolyl isomerase


650 


f o 


515 


AF186084 


Homo 
sapiens 


epidermal growth factor
repeat containing protein


JwiO 


99 


516 


G03602 


Homo sapiens


Human secreted protein, SEQ
ID NO: 7683.




99 


517 


U04706 


Bos taurus


50 kDa protein




77 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 4734.


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y99366 


Homo sapiens


Human fKui4 /b lUNQ^b; amino
acid sequence SEQ ID NO: 88. 


3394


97 


521 




Homo sapiens


¥ l r Lu\ 


1295 


100 


522 


AE000995 


Archaeoglobus
fulgidus


chromosome segregation 
protein (smc1)


153 


20 


523 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 
variable region 


605


97 


524 


AJ223830 


Rattus 
norvegicus


ARE1 


2950 


98 


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen.


1276 


83 


526 


AF145658 


Drosophila 
melanogaster


BcDNA.GH10229


320 


33 


527 


AF112213 


Homo sapiens 


putative Rab5-interacting 
protein 


524 


79 




ui. y j o / 


Homo 
sapiens 


NADP dependent leukotriene b4
12-hydroxydehydrogenase


1616 


100 


529 


ijuoiy 


Homo sapiens 


Human secreted protein 
encoded from gene 9.


328 


32 




ZIT /17QT5C 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induce MG11)


1059 


99 


531


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene bo
SEQ ID NO: 179.


1159 


98 


532 


X76116 


Caenorhabditis
elegans




576 


50 


533 


X76116 


Caenorhabditis
elegans


carrier protein (c2) 


506 


du 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase
propeptide (424 AA) 


1972 




535 


Y09267 


Homo sapiens 


flavin-containing
monooxygenase 2 


2486


100 


536 


Z11773 


Homo sapiens


SRE-ZBP 


2201 


99 


537 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4741 


99 


538 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 






96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


541 


J03244 


Bos taurus 


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848


77 


542 


Y92514 


Homo sapiens 


Human OXRE-11.


2301 


99 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2151 


61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207 


38


545 


A06669 


synthetic 
construct 


preTGF-beta1


2070 


99


TABLE 2


546 


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60.


854 


98 


547 


AF112205 


Homo sapiens 


WSB-1 protein 


2275 


100 


548 


X60271 


Mus musculus 


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


42 


550 


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4.


1112 


qc: 
y D 


553 


AF119855 


Homo sapiens 


PRO1847


265 


67 


554 


1 J Jl / iJD 




MI-TP UT . ft _ T"W"l al r->V> = nrofitvnnv 

nxiu nurt-jL-v axpna precursor 


1332 


100 


555 


AL078468


Arabidopsis
thaliana


putative protein


b40 


40 


556 




Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 

(PID:g4650844)


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein 


1623 


98 


558 


M12140 


Homo sapiens


poj. gene protein; axx 


117 


48


559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 


225 


56 


560 


-A. _J U □ U ± 


Homo sapiens


junD protein 


373 


88 


561 


AFOfm ^ 6 

-fTJT L/ U J X J D 


LaCIlUl ilCtjJQ ± L 


contains weak similarity to

Oil AVfJ If — U -LJ.1U -Lily HICJ U J. i. 


292 6 


54 


562 


AL109839


Homo sapiens


dJIOfiQP? 3 1 fnnvpl PAPPn 
(■nolvfA) — biriHincr nrnh^i 


877 


100 


563 


AF181640 


Drosophila
melanogaster 


BcDNA.GH09817


289


A 1 


564 


AF052723 


Feline
leukemia
virus


gag-pol precursor polyprotein 
gPr80


1547 


43 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 


Homo sapiens 


pt326 4 secreted protein. 


3338


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93 


570


AF155113


Homo sapiens 


NY-REN-55 antigen


3951 


99 


571 


AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 




572 


M69181 


Homo sapiens 


non-muscle myosin B 


7350 




573 


M69181 


Homo sapiens 


non-muscle myosin B 


7311 


98 


574 


Y59678 


Homo sapiens


Secrpf ^fi nrofpin 1 Oft - fiOfl — fi - 
■~> ^ <w x. c c w pj.uiej.n x w o uuo 3 u 

E6-FL . 


110 


100 


575 


AL365234 


Arabidopsis
thaliana 




ia o 


40


576 


AL365234 


Arabidopsis 
thaliana


putative protein 


788


40


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 


2339 


99 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100 


583 


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11425 


99 


584 


AJ223948 


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


3874 


99 


586 


Y42384 


Homo 
sapiens 


Amino acid sequence of 
Iv3l0_7. 


1007 


37 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


98


588 


AF131775


Homo sapiens 


Unknown 


1929 


99 


589


AJ230865


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885


Homo sapiens 


dJ522J7.2 (bromodomain-containing 1
(similar to peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein) 


212 


55 


596 


AL022329 


Homo 
sapiens 


bK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo 
sapiens] 
>Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3.5
protein. 
[Homo 
sapiens 


putative cell cycle control 
protein 


335 


23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 18. 


1574 


99 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584


Homo sapiens 


GGA1 


3265 


100 


603 


Y13115 


Homo sapiens 


serine/threonine protein 
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776) 


2413 


99 


605 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L 


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


3069 


100 


610 


AF163572


Homo sapiens 


Forssman glycolipid
synthetase 


1865 


99 


611 


AF161503 


Homo sapiens 


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 


345 


30 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9).


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


361 


94 


615 


X85786 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2 


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609 


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163.


1684 


99 


620 


AB046382 


Mus musculus 


testis-abundant finger 
protein 


199 


23 


621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286 


Homo sapiens 


HDCMD38P


861 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968 


Homo sapiens 


RH-alpha subunit (AA 1-404) 


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100


629 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1694 


100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid 
dehydrogenase type VII 


1754 


100 


631 


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67))


4273 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94.


794 


96 


633 


AF288288 


Homo sapiens 


HPT protein 


2236 


100


634 


AF041429 


Homo sapiens 


pRGR1


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein 
kinase 


1589 


100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637 


AB004884


Homo sapiens 


PKU-alpha 


3718 


99 


638 


AJ002303 


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b


1002 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
protein encoded in cosmid
T26A5.


2676 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


AF119900 


Homo sapiens 


PR02822 


185 


76 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3830 


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388 


99 


654 


AC004614 


Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 


655 


Y57908 


Homo sapiens 


Human transmembrane protein 
HTMPN-32.


608 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


1942 


99 


659 


W76734 


Homo 
sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin


4752 


59 


663 


AF182316 


Homo sapiens 


myoferlin 


6232 


99 


665


AL161516 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y13355 


Homo sapiens 


Amino acid sequence of 
protein PRO220.


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to
endo-beta-N-acetylglucosaminidase
gene


611 


52 


671 


X56123 


Mus musculus 


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463


Rattus
norvegicus


1 transducin


3619


92


676 


AC005757 


Homo sapiens 


R32611_1


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral 
element}


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1 


1783 




680 


AF118566 


Mus musculus 


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens 


Human ui 1 H fvnp nVofll 

protein. 




99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein
similar to a dual specificity 
phosphatase) 


700 


68 


683


Y86214


Homo sapiens


Nuclear transport protein 
nijjj'ii. protein 

DCUUCJ1UC • 


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 
LHiio^ii protein sequence
SEQ ID NO: 110.


354 


98 


685 


AL021878


Homo sapiens 


dJ257I20.4 (transcription
factor 20 (AR1) (KIAA0292)
(isoform 2))


154 


CI 


686 


AE000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus
cuniculus 


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644.


330 


100 


696 


AC011810 


Arabidopsis
thaliana


Putative methionine 
aminopeptidase 


669 




697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699


Y99401 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO:218.


1366 


100 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2ir*»r5 T3r*ot"pin 


o n*70 


QG' ' 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchm1.


1 CQ*7 
1D7 r 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchm1.


X / Jo 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform
1))


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652.


111 


99 


710 


Y08698 


Homo sapiens 


ranbp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a
human phosphorylation
effector PHSP-2.


754 


99 



PCT/US00/34263


712 


U93574 






TOO 


c o 

by 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716


AL137013 


Homo sapiens 


DA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase


1696 


93 


718 


Y96290 


TT n , - fc T\ A T\**\r- A 

Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIK 


2373 


100 


722 


W41565 


Homo
sapiens]
>W41564
W41564 08-
OCT-1997 05-
APR-1996
Human
calpain.
[Homo
sapiens


Human calpain. 


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae
pre-mRNA splicing protein PRP31
(GB:Z72876)


1143 


46 


726 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae
pre-mRNA splicing protein PRP31
(GB:Z72876)


988 


46 


727 


AC024818 


Caenorhabditis
elegans


contains similarity to Pfam 
family PF00400 (WD domain, 
G-beta repeat), score- 8 1.8, 
Ci 53 x . 4e - Z U f N«j 


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


Tjq 
/ x JJ 




Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


lip m o "7 n 


Oncorhynchus 
masou


GTP-binding protein


3 865 


76 


r J< 




Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862


97 


733 


G02650 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813 


Caenorhabditis
elegans


Hypothetical protein 
Y54F10AL.a 


152 


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033 


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein 
transferase 1-1p; ATE1-1p


2733 


99 





738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793 


100 


739 


AJ133115 


Homo sapiens 


rSC-22-like protein 


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabditis
elegans


strong similarity to the YPT1
sub-family of RAS proteins


960 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase 


2191 


100


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7290.


496 


98 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence.


1565 


99 


748 


M19529


Sus scrofa 


follistatin A 


1906 


98 


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


fos39554_1


2094 


100 


751 


AF074968 


Homo sapiens 


p47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine
phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein L39


160 


77 


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A 


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


568 


38 


766 


AL023828 


Caenorhabditis
elegans


Y17G7B.14


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
(Clone dw665_4).


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3).


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-521)


2641 


97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap 2 


935 


100


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-sulphatase


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated
protein 5 (CYSKP-5).


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961. 


849


99


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


CU130E4.2 (KIAA0796) 


1321 


68 


781 


Z75955 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus
musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95


785 


Y84441 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence.


1048 


99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V0063S 


bacteriophage
lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen


372 


37 


798 


U97189 


Caenorhabditis
elegans


strong similarity to the
P13/P14 family of kinases 


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF234765


Rattus 
norvegicus 


serine-arginine-rich splicing 
regulatory protein SRRP86 


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097


Caenorhabditis
elegans


Similarity to Human 
retinoblastoma-binding 
protein RBAP46 yk662d12.5
comes from this gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121573


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100 


806 


AC013483 


Arabidopsis
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


808 


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis
elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide.


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818 


AB030483 


Mus musculus 


B9 


197 


27


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29


825 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


693 


68 


826 


U23037 


Oryctolagus 
cuniculus 


eIF-2Bepsilon 


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828 


Y30827


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2C97 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720.


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens 


C protein (AA 1-159)


918 


100 


838 


U32865


Drosophila 
melanogaster 


linotte protein 


164 


24 


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase 


2840 


98 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390. 


278 


98 


843 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


529 


100 


845 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol-anchored
protein homolog


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


AF164794 


Homo sapiens 


Diff33 protein homolog 


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1 


2062 


97 


851 


Y58628 


Homo sapiens 


Protein regulating gene 
expression PRGE-21. 


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 


854 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


203 


100 


856 


AF285118 


Homo sapiens 


CGI-203 


452 


100 


857 


AC006069 


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


1383 


55 


858


AL021546


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862


AF161472


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis
elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type 1
receptor associated protein


3559


99


865 


A.E0 01 ^30 


Helicobacter 
pylori J99 


putative 


230 


32 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light 
chain 


699 


91 




*^.L.0"^1673 

V J X O / J 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 
type Zinc finger domains) 


4066 


99 


868


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100 


869 




Homo sapiens 


high-glucose-regulated
protein 8 


3041 


99 


870 


xnmnc a ft 


Homo sapiens


KIAA0841 protein 


3237 


99 


871 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


872 


AF151534 


Homo sapiens 


core histone macroH2A2.2 


1866 


100 


873 


AL021331 


Homo sapiens 


UU J COIN* J i 1 (^iUVai,J.vc \_ • 

*1 poann T7NC-93 (t^Toheitl 1 

C46F11.1) LIKE protein) 


1129 


100 


874 


X14608 


Homo sapiens 


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334


Homo sapiens


dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


306 


100 


876 


X79489 


Saccharomyces
cerevisiae




446 


35 


877 


Y53001 


Homo sapiens 


Human secreted protein clone 
dn834_1 protein sequence SEQ
ID NO: 8.


811 


100 


878 


J\e 4 ol U ofk 




CHMP1.5


957 


100 


879 


X79417 


Sus scrofa


40S ribosomal protein S12 


687 


100 


880


AF0Q131 / 


Saccharomyces
cerevisiae




478 


28 


881 


id /2/b 


Homo sapiens


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.


2547 


100 


882


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883 


ATI flAI OCT 




calcium-independent
phospholipase A2 


2903 


100 


884 


AF020313 


Mus musculus 


proline-rich protein 48 


999 


84 


885 


Y10936


Unm^ pani one 
rlOITlO Eapicllb 


hypothetical protein 


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens


Human transmembrane protein
HTMPN-17. 


1099 


94 


888 


AL117635 


Homo sapiens 




929 


99 


889 


Ar zlUil / 


Homo sapiens


putative glucose
transporter family member
GLUT9


2046 


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416.


192 


57 


892 




Homo sapiens


ubiquitous tropomodulin U- 
Tmod 


1798 


100 


893 


AF090929 


Homo sapiens 


PRO0477p


653 


99 


894 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1)


3196 


100 


895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


2825 


96 


896


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


897 


AE003551


Drosophila 
melanogaster 


CG18176 gene product


633 


33 


898 


AJ237946 


Homo sapiens 


DEAD Box Protein S 


2443 


100 


899


Z97184 


Homo sapiens 


HKE2 


624 


100 


900 


Z97184 


Homo sapiens 


HKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product


446 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901 


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952.


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243.


387 


87 


913 


AJ243721


Homo 
sapiens) 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . [Homo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase 


1710 


100 


914 


U24189 


Caenorhabdit 
is elegans 


hypothetical protein 1207-1; 
Method: conceptual 
translation supplied by 
authors 


244 


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobu 
s fulgidus 


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918 


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


163 


30 


919 


L12018


Caenorhabdit 
is elegans 


putative 


1232 


41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p 


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


866 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabdit 
is elegans 


similar to 

Schizosaccharomyces pombe 


605 


51 


925 


X71978 


Mus musculus


Fi£ 


1503 


95 


926 


M92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_1.


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate- 
epimerase 


912 


100 


931 


U28991 


Caenorhabdit
is elegans


coded for by C. elegans cDNA
cm21c7


660


55


932 


AL080065 


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01884 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor)


1142


80 


936 


AB026808 


Mus musculus


synaptotagmin XI 


2142 


95 


937




Homo sapiens 


HRIHFB2216 


2601 


99 


938


X65724 


Homo sapiens 


ORF2


498 


100


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128.


117 


100 


941 


At u?4 joj 


Homo sapiens


putative HIV-1 infection 
related protein 


452 


100 


942




Caenorhabdit
is elegans 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


350 


69 


943 


AF129756 


Homo sapiens 


G5c 


273 


100 


944 


K23765 


Rattus
norvegicus 


alpha-tropomyosin


133 


96 


945


AC009917


Arabidopsis
thaliana


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD021 protein 


551 


44 


947 


AF055473


Homo sapiens 


GAGE-8


273 


51 


948


X75756 


Homo sapiens


protein kinase C mu


2019 


68 


949 


AF143956


Mus musculus


coronin-2 


2300 


93 


950


I J O UJ 


Homo sapiens


Human PG1 protein sequence.


18*1 


99 




W49041


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


282 


67 


952


now itjooi 


Arabidopsis

thaliana 


gene id:MXC17.7~ 


203 


46 


953


Y01785


Homo sapiens


Human ubiquitin-conjugating 
enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-1998 Human NCE-2 
protein. 


3*5 


100 


954 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


823 


46 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131


2483 


99 


956 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 


958 


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


959 


U54807 


Rattus 
norvegicus 


GTP-binding protein 


1167 


97 


960 


AF058807 


Bos taurus 


GTP-binding protein rah 


606 


97 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325.


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40 


963 


AP001754 


Homo sapiens 


transient receptor potential- 
related channel 7, a novel 
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


dJ1100H13.1 (putative novel 
protein) 


1129 


100 


965 


X61381 


Rattus 
rattus 


interferon-induced protein


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1)


693 


100 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform)


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


1186 


100 


971


U53336 


Caenorhabdit 
is elegans 


weak similarity over a short 
region to myosin heavy chain 


536 


23 


972 


AC018749 


Leishmania 
major 


L8840.12


589 


53 


973 


AF188504


Mus musculus


LNV 


544 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


98 


975 


AF049523 


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


AF161530 


Homo sapiens 


HSPC182


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XLMO1


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine 


2029 


100


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus 
sp. AD45 


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit


1576 


99 


986 


AB030835 


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1 


1262 


38 


988 


AL022238


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE) 


4048 


99 


989 


AL022238 


Homo sapiens


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE) 


2321 


99 


990 


AF161426 


Homo sapiens 


HSPC308 


448 


92


991 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


992 


AF161426 


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccha
romyces
pombe


trna-splicing endonuclease 
subunit 


172 


42 


994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445JL 


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A 


974 


100 


997 


AJ248285 


Pyrococcus 
abyssi 


sarcosine oxidase, subunit 
beta (soxB) 


195 


28 


998 


AE003641


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 
CR930_1.


1340 


98 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence. 


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462O23.2 (novel protein)


2058 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100 


1006 


S45367


Canis
familiaris


centractin 


1315 


98 


1007 


AB022158 


Mus musculus


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabdit 
is elegans 


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6922. 


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 374.


2323 


100 


1018


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a


674 


100 


1021 


AF190625


Coturnix 
coturnix 


qdgl-1 


638 


96 


1022 


AL133363 


Arabidopsis 
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2 
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


U80736 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF133795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccha
romyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabdit 
is elegans 


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580


Homo 
sapiens 


Human membrane protein 
BA0306.


1921 


97 


1039 


U88173 


Caenorhabdit 
is elegans 


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80 


1040 


AF290204 


Homo sapiens 


blood group carrier molecule 
DOK1 


1637 


99 


1041 


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF140683


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BCDNA.GH04929 


204 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase 


1317 


100 


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-1 


236 


85 


1052 


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756 


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 


AF181856 


Rattus 
norvegicus 


tRNA selenocysteine 
associated protein 


1525 


99 


1056 


U89649 


Chlamydomona
s reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060 


AF224263 


Heterodontus 
francisci


HOXD8 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces
coelicolor
A3(2)


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10 
(HYDRL-10).


2547 


100 


1064 


AF263614 


Homo sapiens 


acetyl-CoA synthetase 


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PRO221.


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292) 


662 


98 


1067 


Y18930 


Sulfolobus
solfataricus


hypothetical protein 


162 


29 


1068


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived 
polypeptide.


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens 


Amino acid sequence of
protein PRO328.


1271


91


1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66 


1079 


AL132965 


Arabidopsis 
thaliana 


putative WD-40 repeat-protein


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082 


AF016416 


Caenorhabdit 
is elegans 


F29A7.4 gene product


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.


202 


97 


1086


AB030814 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


^783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


613 


100 


1090 


AK023982 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


U34973 


Mus musculus 


protein tyrosine phosphatase- 
like 


1131 


95 


1094 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 


56 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabdit 
is elegans 


similar to thioredoxin 


242 


39 


1099 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase 
(phosphotriesterase) -related 
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814 . 


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8120.


475


96


1115 


AF229439 


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118 


A12155 


Homo sapiens 


Human X5L cDNA. 


1673 


100 


1119 


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent
Protein Kinase LIKE protein)


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


321 


36 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690


Homo sapiens 


dJ202I21.1 (novel protein) 


952 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2 


1286 


99 


1127 


AB030505 


Mus musculus 


UBE-lc2 


1069 


79


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein
sequence. 


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccha
romyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccha
romyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95.


440 


98 


1139 


W88104 


Homo 
sapiens 


A Rab protein designated 
HRABS-2.


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339.


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product.


3309 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03875


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370


98 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein
encoded by gene 113 clone
HEAAR60.


228


55


1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35


1570 


92 


1152


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma1


1855 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


672 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107.


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163 


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Dedd1


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


76 


1167 


AF187733 


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase 


951 


55 


1169 


AF064604


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6.


1191 


100 


1171 


L03188 


Saccharomyce 
s cerevisiae 


putative 


180 


22 


1172 


AF113751 


Mus musculus


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabdit 
is elegans 


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039715


Caenorhabdit 
is elegans 


similar to ATP synthase B
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo
sapiens]
>R94974
R94974 09-
MAY-1996 27-
OCT-1994
Human TCL-1
polypeptide.
[Homo
sapiens


T cell leukemia/lymphoma 1


617


100


1183 


U42841


Caenorhabdit 
is elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth)


182 


33 


1190 


X89602


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus
influenzae
Rd


ribosomal protein S6 
modification protein (rimK) 


268


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1 


1403 


60 


1193


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice
variant RB3 * 1 


1093 


97 


1195 


U35244


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 


912 


47 


1198


rVv li jit J 


Caenorhabdit 
is elegans 


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40)


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky 


2235 


76 


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus


Ac39/physophilin


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein


945 


66 


1213 


AF277233 


Naegleria 
fowleri


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103.


590 


100 


1216 


Z72510 


Caenorhabdit
is elegans


similarity to yeast UTR3
protein (Swiss Prot accession
yk677h11.5 comes from this
gene


634


49


1217 


Z49703 


Saccharomyce 
s cerevisiae 


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabdit 
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


58 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence.


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226 


X64002


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase 


2846 


100 


1228 


AJ005620


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


L08239


Homo sapiens 


located at OATL1 


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabdit 
is elegans 


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein 


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4510. 


324 


100


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa)


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N-acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1


1139 


100 


1250 


Y1314B


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46 
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1252 


AF146738


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


419 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195_1 


831 


78 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA 
transformylase


1556 


88 


1257 


Z35094 


Homo sapiens 


SURF-2


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214.


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming
protein; similar to P14373
(PID:g132517)


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyltranspeptidase 
(AA 1-568) 


697 


32 


1263 


AF173871


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide-associated
protein-1 (CNAP-1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.


1622 


100 


1267 


AF061346 


Mus musculus 


Edp1 protein


1077 


64 


1268 


U97006 


Caenorhabditis elegans


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


55 


1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710


Arabidopsis 
thaliana


putative protein 


348 


49 


1275 


AC004449


Homo sapiens 


R33683_3 


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AG87, SEQ ID NO: 210.


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


478 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis 
thaliana 


Similar to AIG1 protein 


406 


35 


1283 


AK024432 


Homo sapiens


FLJ00022 protein 


403 


35 


1284 


W9£l53 


Homo sapiens


Human FADD-interacting
protein (FIP).


1825 


81 


1285 


AJ001019 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF178632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 
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1291 


Z73424 


Caenorhabditis elegans


C44B9.1 


235 


36 


1292 


Y94B71 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF180425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140


489 


29 


1294 


G03B56 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937. 


538 


99 


1295 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia 
coli 


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains
protein 1 


1997 


100 


1299 


U41023 


Caenorhabditis elegans


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia
coli 


open reading frame (AA 1-65)


359 


100 


1304 


U19577 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307 


U58750 


Caenorhabditis elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308 


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ210Bl.l (KIAA0680) 


267 


34 


1310 


X82693


Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabditis elegans


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein
sequence.


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein


1044 


100 


1321 


AF174605 


Homo sapiens]
>Y83086 Y83086 09-MAR-2000 28-AUG-1998 F-box protein FBP-18.
[Homo sapiens


F-box protein Fbx25 


467 


70 


1322


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine
endogenous
retrovirus


pol


304


64


1324 


AL138655 


Arabidopsis 
thaliana


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis
thaliana 


putative protein 


946 


35 


1326 


AL133215 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


232 


39 


1332 


Y41741 


Homo 
sapiens 


Human PRO704 protein 


186C 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1 


411 


91 


1334 


Z82271 


Caenorhabditis elegans


Similarity to Mouse kinesin-like
protein KIF4 comes from
this gene 


578 


44 


1335 


AE000810


Methanobacterium
thermoautotrophicum


conserved protein


290 


43 


1336


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabditis elegans


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398-67881


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844) 


894 


35 


1345 


AF2S7466 


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like 
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96 . 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabditis elegans


contains similarity to 
SW : RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


fU 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360


AC074331 


Homo sapiens 


ZNF234


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response
element binding and beta
transducin family proteins


5035


99


1362


Z48475




glucokinase regulator 


3160 


99 


1363 


Z48475


Homo sapiens


glucokinase regulator 


2682 


97 


1364 


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1
protein 


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to eiegans 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
^,+nnov J.3 . 


1342 


100 


1369 


AJ245621 


Homo sapiens 


protBin 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alnVia.') rhain nrpnirfinT (AA — 
axpilcl i icl n i ^iic^u^oul 

25 to 1018) (3416 is 2nd base 
in codon ) 


5908 


99 


1372 




Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


99 


1373


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus


1 am-ina a Q Ror* 1 a f pri nolVDeiDt ide 

1C 


1567 


69 


1375


U53445 


Homo sapiens


DOC1


1645


46


1376


AL11733 / 


Homo 
sapiens 


hATQUTG 1 (zinc finaer 
protein 33a (KOX 31) ) 


250 


60 


1377 


ACQOb J^b 


Homo sapiens


R26660_1, partial CDS


1126 


100 


1378


U35113 


Homo sapiens


metastasis-associated gene 


1823 


69 


1379


L15313 


Caenorhabditis elegans


putative


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381


ABO J /jqU 


Homo sapiens


ATtfKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


.TUN J\TlAll* 


959 


97 


1383 


AF237676


Mus musculus 


G beta-like protein GBL 


1721 


96 


1384


Ar23 fb /b 


Mus musculus


G beta-like protein GBL


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory- 
protein CaREG-1. 


715 


100 


1386 


AF212162 


Homo sapiens 


ninein


10369 


99 


1387


AJjO J lb ob 


Homo sapiens


dJ963K23.2 (novel protein) 


337 


33 


1388


AC004 B 30 


Homo sapiens


similar to zinc finger
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996
27-APR-1995 TRP-1 protein.


542 


86 


1389


7iT?T ft*7 Q R Q 
/VT J, o / y O 79 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


ALU J 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392


TvWO ft "5 ■> c c;


Homo sapiens


inner centromere protein 
INCENP 


1794 


99 


1393


von o/n 


Homo sapiens


axonal transporter of 
synaptic vesicles 


4584 


99 


1394 


AF076249 


Homo sapiens


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305.


299 


75 


1396 


AC004809 


Arabidopsis
thaliana


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


617 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659_5 


280 


54 


1402 


X91489 


Saccharomyces cerevisiae


putative HMG box 


164 


27 
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ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1403 


Y79222 


Homo 
sapiens 


Human transferase TRNSFS-14. 


2842 


100 


1404 


X81058 


Mus musculus 


tex261


1010 


99 


1405 


AB012084 


Mus musculus 


ITM 


194 


29 


1406


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585 


Rattus 
rattus


PTB-like protein 


2684 


99 


1408 


X75760 


Drosophila 
melanogaster 


LRR47


364 


29


1409 


U76618 


Mus musculus 


N-RAP 


804 


48 


1410 


AC00557B 


Homo sapiens 


F20887_1, partial CDS


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein 


360 


100 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179)


911 


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-aminoadipate
aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus
norvegicus


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 




TVT.T £94RQ 
J\Ll A. D Z*4 D O 


Homo sapiens


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2))


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y4B517 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534 


Homo sapiens 


HSPC049


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein
PRO1106. 


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99800 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells . 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1 . 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138


Homo sapiens 


GGA3 long isoform 


3346 


100 
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PC17US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1442 


AB039669 


Homo sapiens 


ALEX3


1944 


100 


1443 


AF237711 


Drosophila 
melanogaster 


Diablo 


191 


27 


1444 


AJ011896 


Homo sapiens 


Naf1 beta protein


439 


39 


1445 


X73874 


Homo sapiens 


phosphorylase kinase 


6233 


98 


1446 


AF214114 


Homo sapiens 


breast carcinoma-associated 
antigen BCAA 


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabditis elegans


contains weak similarity to 
an AMP-binding motif 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


89 


1450 


Y95004 


Homo sapiens 


Human secreted protein 
vc54 1, SEQ ID NO: 46. 


985 


100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 


AF107203


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


23 B011 


Mus musculus


DMR-N9 


882 


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 


AL035409 


Homo sapiens 


dJ564Mll.3 (similar to 
sialyltransferase)


1356 


100 


1455 


D44480 


Mus musculus


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


At l*kZbt>Z 


Gallus 
gallus 


re t inovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB0252bo 


Mus musculus 


granuphilin-a


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463


AC004997


Homo sapiens 


lUdLCn LO LOlU ti*± J3 / y 

(NID:g573097) , R19699 
(NID:g774333) 


869


98 


1464 




Homo sapiens 


mr»+-r>V* *- n P<5T« 74"}Q7Q 

1 1 lei L LU CCD X a lilj J 1 J 

(NID:g573097) , R19699 


869 


98 


1465 


U32743 


Haemophilus
influenzae
Rd


fucose operon protein (fucU) 


315 


50 




vn q n o o 

X U y U 4, A 


Homo sapiens




2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene 


1072 


99


1468


AT U / 1.3*11 


Spinacia
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54 . 


1053 


100 


1470 


AF032666 


Rattus 
norvegicus 


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17) . 


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1 


1101 


50 


1475 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60 , SEQ ID NO: 156. 


1879


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis elegans


coded for by C. elegans cDNA 
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


U10536 


Pan paniscus 


MHC class I A 


675 


84 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1481 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein
similar to C. elegans
F55A12.9 (Tr:P91086))


1274 


65 


1482 


Z98977 


Schizosaccharomyces
pombe


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus 


JNK/SAPK-associated protein-1 


4966 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487 


X84156 


Saccharomyces cerevisiae


ATH1 


341 


29 


1488 


AF038963


Homo sapiens 


RNA helicase


446 


34 


1489 


U56966 


Caenorhabditis elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobus fulgidus


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence . 


3513 


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone
fj283-11).


462 


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94697 


Homo 
sapiens 


Human protein clone HP10574 . 


1371 


100 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AF037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum


putative target YPL207w of 
the HAP 2 transcriptional 
complex related protein 


269 


35 


1499 


AB039947 


Homo sapiens 


X11L-binding protein 51


227 


36 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 


AF179896 


Homo sapiens 


TALE homeobox protein Meis2b 


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 


Y53005 


Homo sapiens 


Human secreted protein clone 
pm749_8 protein sequence SEQ
ID NO: 16. 


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506 


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008


1181 


98 


1510 


U64601 


Caenorhabditis elegans


Gene probably begins in the 
next cosmid


415 


58 


1511 


AL356192 


Neurospora 
crassa 


related to MDM1 protein 


196 


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513 


AF168717 


Homo sapiens 


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AP003140 


Caenorhabditis elegans


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AL121764 


Schizosaccharomyces
pombe


yeast atp12 protein precursor
homolog


270


30


1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190.


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis
thaliana


F17F8.22


277 


37 


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83


1526


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase 


1496 


79 


1528 


AK024423 


Homo sapiens 


FLJ0G012 protein 


611 


100 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase 


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
H0EAS24 . 


493


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 


99 


1534 


AC007190


Arabidopsis 
thaliana 


F23N19.9 


374 


37 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3693 


99 


1538


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 


AF266756 


Homo sapiens 


sphingosine kinase 


2011 


99 


1540 


Z48804 


Homo sapiens 


OA1


2238 


100 


1541 


AF000195 


Caenorhabditis elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


42 


1542 


Y71159 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330


Homo sapiens 


HRIHFB2007


631 


50 


1545 


AF198487 


Homo sapiens 


transcription factor LBP-1b


2822 


100 


1546 


AF016417 


Caenorhabditis elegans


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 


AB035495 


Carassius 
auratus 


ubiquitin-activating enzyme 
E1


836 


42 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668) 


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


292 


42 


1551 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


822 


44 


1552 


ALilb7734 


Schizosaccharomyces
pombe


putative mannosyl transferase 
involved in N-glycosylation 


435 


37 


1553 


AF079527 


Mus musculus 


IER5 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain
dehydrogenase/reductase 


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport
protein, MTRP-1.


1975


99


1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-acetylglucosaminyltransferase


262 


44 


1561 


AL109627 


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100


1563


AJjUj -> H Z *k 


Homo sapiens


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100


1564


/■VV— U \J £, *t \J w 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_1


919 


82 


1566




Caenorhabditis elegans


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567




sapiens 


F-box and WD-repeats protein 
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus


truncated form of Sox17


1047 


78 


1569 




Homo sapiens 


unnamed protein product 


210 


91 


1570


X75756


Homo sapiens


protein kinase C mu


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


2368 


100 


1572




Drosophila
melanogaster


pm (544^ aene oiroduct 


180 


31 


1573


/vr u /louj 


griseus 

subsp 

griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabditis elegans


F22D3.3 gene product


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576




Homo sapiens


oxytocin receptor 


2002 


100 


1577


AF237711


Drosophila
melanogaster


Diablo 


421 


54 


1578


G00975 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744 


Cryptosporidium parvum


thrombospondin-related
adhesive protein 


123 


33 


1580


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 
Em:AK000219) ) 


6-6-3 


100 


1581 


AF041853 


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


106 


1583 


AE001803 


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat 
transmembrane protein FLRT1 


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4 . 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces
pombe


hypothetical protein


235


54


1595 


W78324 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ
ID NO: 18. 


2236 


98 


1597 


AF174605 


Homo sapiens


F-box protein Fbx25 


1408 


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf50 


2305 


100 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein 
sequence . 


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821


99 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605 


AF185576 


Mus musculus 


POZ/zinc finger transcription 
factor ODA-8 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic
construct 


IFN-pseudo-omega 2


800 


98 


1608


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1668 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218


Homo sapiens 


ski protein (AA 1 - 728) 


3765 


100 


1611 


Y03200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890 


62 


1617 


X58079 


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66678 


Homo 
sapiens 


Membrane-bound protein
PRO1009.


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein 


288 


100 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus fulgidus


A. fulgidus predicted coding 
region AF0859 


240 


36


1624 


AL355013 


Schizosaccharomyces pombe


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203.


756 


100 


1628


AL031775


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 


1629 


AF132484 


Mus musculus 


unknown 


286 


68 


1630 


AF017096 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03c


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein 
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38 


1635 


AF026246 


Homo sapiens 


HERV-E integrase 


411 


90 


1636 


Y50943 


Homo sapiens


Human adult brain cDNA clone
ve8_1 derived protein.


2126 


95 


1637




Homo sapiens


L-pipecolic acid oxidase


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639


Y94942 


Homo sapiens 


Human secreted protein clone
yk251_1 protein sequence SEQ
ID NO: 90.


1 J zu 


100


1640 


AF235030


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila
melanogaster 


WDS 


O Do 


2 b 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2) . 


1352


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42.


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1 


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
iigana \cione jihj . 


4137


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


ABU 1D34 o 


Homo sapiens 


r ne i c p 
EipSi.DK 


4464


Q Q 


1652 


AL161576 


Arabidopsis
thaliana


putative protein 


1341 


48 


1653


AC005313


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428


Homo sapiens


dJ184J9.1 (KIAA0601 protein)


3526


100


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1656 


AB017910 


Dictyostelium discoideum


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5.


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL678627 


Schizosaccharomyces pombe


actin-like protein; (2 actin 
domains)


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664 


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyces cerevisiae


unknown 


138 


26


1666 


AF177385


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668 


S67513 


Borna
disease
virus BDV,
WT-1, Halle
B1/91, horse
brain, field
isolate,
Peptide, 370 aa


p40


397


43


1669 


Z99753 


Schizosaccha 

romyces 

pombe 


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens


Human secreted protein, SEQ
ID NO: 7211.


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005


99 


1673 


Y51846


Homo sapiens 


Human 18.1 homolog protein
fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


152 


29


1675 


Y94867


Homo
sapiens


Human protein clone HP10563.


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17


1679 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349 


100 


1681 


AF019236 


Dictyostelium discoideum


TipD 


(Til 




1682 


AJ243459 


Leishmania 
major 


proteophosphoglycan 


153 


26 


1683 


Z69369 


Schizosaccharomyces pombe


putative GTP-binding protein 


560 


46 


1684 


X94910 


Homo sapiens 


ERp28


1334 


100 


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein


196 


19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2956 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886


88 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein


138


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


katanin p60 


1664 


66 


1694


AF263539 


Homo sapiens


arginme iv — rnc unyi v-j-a.iitix.r2L <* o e 


1774 


100 


1695 


AF222689


Homo 
sapiens 


protein arginine N-methyltransferase 1-variant 2


1182 


81 


1696 


AK000193


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing 


2181 


100 


1699 




Homo sapiens 


C2H2 zinc finger protein


488 


54 


1700 


Y44676


Homo sapiens 


Human ARF-Related Protein-1
(HARP-1).


938 


97 


1701 


AK022407


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF055078


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus


RP42 


1057 


77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708


U27121 


Danio rerio


G12 


212 


47 


1709 


AL391710 


Arabidopsis thaliana


putative protein


505


50


1710 


B01311 


Homo sapiens 


Human PR0241 polypeptide. 


1649 


97 


1711 


U40750 


Mus musculus 


formin binding protein 30 


4561 


85 


1712 


AJ011118 


Mus musculus 


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


2960 


100 


1715 


U08227 


Rattus 
norvegicus 


Ras-related protein 


511 


51 


1716 


AF168795 


Rattus 
norvegicus 


schlafen-4


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1719 


AB029333 


Halocynthia 
roretzi


HrPET-1


1069 


46 


1720 


AF071317 


Mus musculus


COP9 complex subunit 7b


/ 




1721 


AJ272215 


Homo sapiens 


HEYL protein 


1 C Q 1 
lbol 


Q Q 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabditis elegans


similar to Uncharacterized
protein family UPF0034,


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


cad' 
bob 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100


1726 


AF255443


Homo sapiens


CGI-201 protein 


4397


Q Q 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99


1728 


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529 


Gallus 
gallus 


tensin 


1411 


84


1730 


Z73423 


Caenorhabditis elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


PR66105" 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015


100


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus


leucine-rich-repeat protein 


3531 


94 


1736 


AF096709 


Drosophila 
virilis


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


L15314 


Caenorhabdit 
is elegans 


contains similarity to Pfam 
family PF01772 N=1


206


37


1739 


X54618 


Listeria monocytogenes


phosphatidylinositol specific
phospholipase C


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins)


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173.


1013 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPS1A


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100


1748 


AK024436


Homo sapiens 


FLJ00026 protein 


1619 


100


1749 


AE000877 


Methanobacterium
thermoautotrophicum


conserved protein 


231 


36 


1750 


AF101361 


Drosophila 
melanogaster 


Abnormal X segregation 


193 


33 


1751 


Y15067 


Homo sapiens 


ZNF232


889 


100


1752 


AF251038


Homo sapiens 


GAP-like protein


822 


100


1753 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)


352 


57 


1754 


X69089 


Homo sapiens 


165kD protein 


5703 


99 


1755 


AL049795 


Homo sapiens 


dJ622L5.3 (novel protein) 


10*9 


100 


1756 


AL031393 


Homo sapiens 


dJ733D15.1 (Zinc-finger 
protein) 


2765 


100 


1757 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


2020 


99 


1758 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


776 


43 


1759 


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760 


Y12065


Homo sapiens 


hNop56 


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 


AC002394 


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 


AB013365 


Bacillus 
halodurans 


YlqF 


350 


34 


1766 


Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145


71 


1767 


AC009176 


Arabidopsis
thaliana 


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


216 


27 


1768 


AKG006 4 / 


Homo sapiens 


unnamed protein product 


737


qq 


1769


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


U89435 


Mus musculus 


unknown 


829 


86 


1772 


S70011 


Rattus sp.


tricarboxylate carrier 


1604


95 


1773 


AL035086 


Homo sapiens 


dJ44A20.2 (novel protein)


2036


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


1057 


99 


1775 


AF110330 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269529 


Homo sapiens 


glycerol 3-phosphate permease


2787 


100 


1777 


Z81579 


Caenorhabditis elegans


cDNA EST yk76fl.5 comes from 
this gene 


232 


31


1778 


AY007239 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccharomyces pombe


oxysterol-binding protein
family 


644 


38


1780 


AF254260 


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49


1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333 


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014.


570 


100 


1786 


S82G37 


Homo sapiens 


Ig lambda-like gene/beta-
glucuronidase exon 11 homolog


247


100


TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS * 


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e-
12 157-181


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e-
13 358-381 


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 9.400e-
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e-
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PR00464G 
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e- 
10 517-535 PR00208A
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393-
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.62 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333 


25 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins. 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650
BL00115M 19.19 8.130e- 
16 731-774 BL00115H
14.34 9.392e-16 463- 
496 BL00115A 15.44 
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 9.289e-14 591- 
617 BL00115I 8.33 
4.336e-13 535-590 
BL00115L 12.25 5.939e- 
13 662-694 BL00115G 
11.65 6.011e-13 435- 
463 BL00115K 15.03 
3.417e-10 617-659 
BL00115O 16.76 5.805e-
10 863-913 BL00115P 
11.54 7.538e-10 913- 
953 BL00115S 18.24 
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins . 


BL00050A 23.71 9.250e-
27 94-127 BL00050B
14.81 8.125e-12 133-
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756


Putative esterase. 


PF00756C 14.12 1.108e-
09 486-516 


32 


BL00557 


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins. 


BL00557D 17.76 5.065e- 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166 


36 


PD01270


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e-
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


37 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e-


38 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298


40 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e- 
14 342-360 PR00380C 
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223 


45 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180-
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216-
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e-
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64
6.108e-14 104-120
PR00988E 8.27 3.872e-
11 174-186 PR00988D
5.95 6.878e-10 160-171
PR00988B 11.60 2.915e-
09 57-69


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217 
PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.063e-16 230-
250 PR00762E 12.07 
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e-
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 


BL00790N 13.25 6.116e- 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I


DM00471A 11.73 9.357e- 

X J D J D 0 JJP1U Kit 1 XO 

8.45 4.S57e-12 70-81 


80


PD02876


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393-
410 


83 


BL00708


Prolyl endopeptidase 
family serine proteins . 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e-
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.315e-
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579 


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE 


PR00300A 9.56 7.545e-
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e-
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins . 


BL01019A 13.20 8.013e-
12 43-83 


107 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.6"0 S.OOOe- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins . 


BL00191K 17.38 4.951e- 
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins . 


BL01138A 10.96 8.297e-
10 38-50 


113 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins . 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 8.560e- 
13 36-67 


119 


PR00529 


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e-
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e- 
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding 
protein. 


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e-
10 475-496 


136 


BL01310


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 2.432e-
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.800e-13 18-35
13 74-91 BL00028 
16.07 9.100e-13 186-
203 BL00028 16.07 
8.043e-12 46-63 
BL00028 16.07 8.435e- 
12 130-147 BL00028 
16.07 9.217e-12 270- 
287 BL00028 16.07 
6.192e-11 242-259
BL00028 16.07 4.000e- 
10 158-175 


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.400e-
25 335-374 


149 


BL00126


3'5'-cyclic nucleotide
phosphodiesterases 
proteins . 


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-

27 .56 8 . 269e-ll 442- 


151 


BL00632 


Ribosomal protein S4 
proteins . 


BL00632 23.79 S.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases 

proteins. 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 

8.385e-13 99-151
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


FKUU44JA 1J I-"" 

13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e- 
to onc.-'l'KO BL00406A 
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1
proteins . 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e-
13 139-158 


168 


BL00362 


Ribosomal protein S15 
proteins . 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 1.000e-
35 640-686 BL00039A 
18.44 1.964e-13 212- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e-
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT 8
family proteins.


BL01310 14.74 2.432e-
29 133-169


179 


PD0106b 


protein" ZINC Finger 

ZINC- FINGER METAL- 


PDOlOSb" 19.43 9.455e- 
36 6-45 
SEQ ID NO:


ACCESSION
NO.


DESCRIPTION 








BINDING NU.




180 


PR00007 


COMPLEMENT C1Q DOMAIN


PR00007B 14.16 7.429e- 

19.33 4.938e-19 133- 
160 PR00007C 15.60
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 

13.34 8.714e-17 162-
177 BL00383E 10.35 

1.000e-14 333-344
BL00383E 10.35 7.300e- 

14 628-639 BL00383F 
15.51 1.720e-13 371- 
387 BL00383C 10.10 
3.000e-13 217-228 
BL00383D 11.92 7.000e- 
13 295-308 BL00383B 
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e- 
no c n o oft dt ftdi qth 

11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488 


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.311e-
15 83-105 PR00450C
12.22 6.286e-13 47-69


193 


PF00564


Octicosapeptide repeat
proteins.


T5T? A A C 12 OA 1 A C 1 C A m. 

16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE 


DDnncmr oa hi o i eta 
15 204-224 PR00503B 

_7 . J D -J . -J ' -i. C -i-J -L / V J. 0 / 


195 


BL00901 


Cysteine 

synthase/ cystathionine 
beta -synthase P- 

pnOBpiiGluc all * 


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 


BL01131A 26.62 2.343e-
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e-
18 65-87 PR00261B
14.12 4.000e-18 143- 
165 PR00261A 11.02 








4.833e-18 143-165 

JrKUU<:olU IZ.Hf /.bOUe- 

18 143-165 PR00261B
14.12 5.065e-16 65-87 
PR00261C 11.37 8.967e-

ID l*tj-lo3 rKOUiDir 

11.57 4.938e-13 143- 
165 PR00261E 11.08 
7.188e-13 65-87 
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
i 

XOj 


209 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-

ij X 1 a - -L / -3 rr UU >7X\. 

20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.781e-
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 

JL X i 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e- 
30 43-91 


Oil 


BL00183


Ubiquitin-conjugating
enzymes proteins.


30 43-91 






DEAD-box subfamily ATP-
dependent helicases
proteins.


dt nnnion it i onn» 

bLiUuujyiJ ^ i . b / l.yuue- 
29 568-614 BL00039A 

10 .44 1 . O / J.C-ZJ Zl"OU 

BL00039C 15.63 1.720e- 

11 Jul JOu DLiUUU J _7D 

19.19 4.064e-ll 277- 
303 


217 


BL00100 


Chloramphenicol
acetyltransferase
proteins.


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e- 
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE 


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e-
19 18-39 


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-
21 21-38 BL00636B 

13.11 0.*UUC-l3 i3"OD 


229 


PR00301 


70 KD HEAT SHOCK PROTEIN


PR00301F 13.98 7.563e- 

13.78 4.300e-12 361-
382 


230 


BL00460 


Glutathione peroxidases 

ep 1 enrtpvo Koine nvfth oi r q 


BL00460A 28.67 8.773e- 

ZU jO /U DbUU^ DUd 

9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins.


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C 








17.27 4.462e-ll 47-70 
PR0044QD TO 7 Q 7 ITOp- 

11 109-123 


235 




LEUCINE-RICH REPEAT
SIGNATURE


ri\ v u \j x z> J-J x x - o o / . j \j uvz 

10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 245-259 PR00019B 
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 




PD00289


PROTEIN SH3 DOMAIN
REPEAT PRESYNA.


PD00289 9.97 8.448e-09 
67-81 


240


PR00011 


TYPE III EGF-LIKE 
SIGNATURE


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and 
deoxycytidylate
deaminases zinc-binding 
region s . 


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 8.043e- 
09 124-134 


248 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140 

TlT.nnO^ Cl 1C 7C ■> OQC<a_ 
XJJjUU^^IDA ID. 1 Z> Z.ZOOC- 

24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674 


AAA-protein family 
proteins . 


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU.


PD01796 15.01 6.045e- 
no ci do 


256 


BL50002 


Src homology 3 (SH3)
domain proteins profile.


BL50002B 15.18 2.800e-
in 4*ac 


256 


PR00094 


ADENYLATE KINASE 
SIGNATURE


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161-
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
xj J5 rttuuyjsfi 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262 


BL00388 


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-

0-3^ OxjUUjBOD 

31.38 3.864e-33 66-108 
BL00388D 20.71 1.000e-
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e-
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226 


Intermediate filaments 
proteins . 


BL00226D 19.10 1.000e-
37 362-409 BL00226B 












23.86 8.043e-35 196-








244 BL00226C 13.23 








' i UUUC i, U ^ O 1 J 








BL00226A 12.77 6.143e-








15 96-111 


271 


PD02952 


KINASE TRANSFERASE 


PD02952C 15.76 9.731e- 






CHnT.TNF PEOTFTN 

uHUUi ilCj ri\ul ^1LN 


lfi 2^^.-265. Pn02952B 








ic C7 c C2C»_oQ 215- 








Z Z J» 


272 


PD02929


ADHESION GLYCOPROTEIN 


PD02929A 28.27 1.000e-






r Is. CiV- U ROwR J. • 


infi-ifio pr>n9929R 








18.36 8.800e-17 179- 








199 


— — , 

274


BL01027


Glycosyl hydrolases 


X3liUlU^/i5 lb. 34 J .iobe- 






family 39 proteins.


u y zlJ - 2 bU 


275


PR00424


ADENOSINE RECEPTOR 


PR00424D 14.32 6.451e- 






SIGNATURE 


11 39-59 


277 


BL00052


Ribosomal protein S7








proteins . 


13 137-184 BL00052B








15.17 5.143e-12 208-








235


279 


BL00790 


Receptor tyrosine kinase 


BL00790N 13.25 5.659e- 






class V proteins. 


13 267-294 


280 


PR00319 


BETA G-PROTEIN


PR00319D 11.64 6 . b2be- 






(TRANSDUCIN) SIGNATURE


23 107-125 PR00319C 
















PR00319A 15.27 8.364e-
















11.4/ o.zUUe-1? 'U-OD 


281




BETA G-PROTEIN


DPnniiQn i 1 c.a fi?Ct»_ 






(TRANSDUCIN) SIGNATURE


23 94-112 PRCUJIjC 








13.41 1.000e-21 76-92








rKU U J 13K 13 . Z / Q . JQ4e- 








■51 id cc Donnn qb 








11.4/ o.zuue-i:? 


287 


PF00929


Exonuclease . 


PrOD92SIJ lo . 1 / ( . Jobe- 








09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 








09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 








09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER


PD00066 13.92 8.714e- 






METAL-BINDI.


12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 


BL00028 16.07 5.500e-






domain proteins. 


15 322-339 BL00028 








16.07 9.471e-14 433- 








450 BL00028 16.07 








4.600e-13 648-665 








BL00028 16.07 5.500e- 








13 760-777 BL00028 








16.07 9.550e-13 788- 








805 BL00028 16.07








3.348e-12 704-721 








BL00028 16.07 6.478e- 








12 461-478 BL00028 








16.07 8.435e-12 844- 








861 BL00028 16.07








1.692e-11 593-610








BL00028 16.07 2.038e- 








11 211-228 BL00028 








16.07 5.154e-11 732-








749 BL00028 16.07 








5.846e-11 377-394








BL00028 16.07 6.885e- 








11 816-833 BL00028 








16.07 7.231e-11 676-








693 BL00028 16.07 








9.654e-11 564-581








BL00028 16.07 4.086e- 








09 517-534 BL00028








16.07 7.429e-09 489- 








506 


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase. 


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152


tRNA synthetases class 
II . 


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 






28.03 9.250e-21 220- 
257 PF00152B 15.67 
2.658e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.250e- 
35 37-76 


306 


PD02784 


PROTEIN NUCLEAR


PD02784B 26.46 5.840e- 




RIBONUCLEOPROTEIN.


09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e- 
13 188-212 PR00237G 
19.53 7.207e-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79


309 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E 19.63 8.615e-
14 430-460 BL00522D
27.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins . 


BL00326D 8.76 5.235e- 
lu o bb - o? / 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e- 

14 151-174 BL00290B

13.17 9.000e-12 211- 
229 


313 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-
15 63-76 


317 


BL01020 


SARI family proteins. 


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport 
proteins.


BL00216B 27.64 4.696e-
11 164-214 


320 


PR00109 


TYROSINE KINASE 


PR00109B 12.27 4.814e- 




CATALYTIC DOMAIN 


10 216-235 






SIGNATURE 




321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e-
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e-
30 183-236 BL01241 
35.81 3.222e-13 282- 
335 


326 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569 
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e-
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family
proteins.


BL01016C 22.84 3.925e- 
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e- 
10 4-19 BL01016F 
13.34 1.563e-09 200-
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160


Kinesin light chain
repeat proteins. 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e-
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE 


PR00109B 12.27 4.764e-






CATALYTIC DOMAIN 
SIGNATURE 


11 135-154 


351 


BL01187 


Calcium-binding EGF-like
domain proteins pattern 
proteins . 


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.800e-11 13-29
BL01187B 12.04 7.429e- 
10 54-70 BL01187B 
12.04 5.725e-09 231- 
247 BL01187A 9.98
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins . 


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066 
13.92 6.500e-13 233- 
246 PD00066 13.92 
4.300e-09 289-302 


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors . 


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 5.080e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins . 


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242 


DNA polymerase (viral) 
N-terminal domain 
proteins . 


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.644e- 
09 1038-1092 


367 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.oJUe-lb JO-49 
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-








10 88-118 


380


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381 


BL00455


Putative AMP-binding 
domain proteins. 


BL00455 13.31 5.714e- 
12 50-66 


382


PR00624


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR . 


PD02870B 18.83 6.000e- 
10 97-130 


388 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.667e- 
09 151-174 


390


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.200e-
15 221-246 BL00215A 

Id . Hz / ,oioc-H 

BL00215A 15.82 8.851e-

10.44 9.526e-11 69-82

09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 


BL00674B 4.46 2.723e- 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e- 
11 141-155 


398 


PR00761


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component . 


PF00676B 24.71 8.071e- 
18 331-369 PF00676D 
14.40 J.8b4e-la 48b- 
506 PF00676C 16.88 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 

RT.nn^14P 11 65 4 7flflf»- 
OLiUUDJ-lf 11. v3 *± . z o o c 

10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e- 
09 105-140 


404




LEUCINE-RICH REPEAT

SIGNATURE 


PR00019B 11.36 1.450e-
10 73-87 PR00019A 
11.19 8.043e-10 76-90 
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e-
18 358-406 BL00232B 
32.79 5.500e-16 246- 








294 BL00232B 32.79 
9.384e-15 463-511

BL00232C 10.65 2.537e- 
12 244-262 BL00232C 
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e-
11 27-45 


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin) . 


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.695e- 
09 126-180


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
aq 51:5.971: 


411 


PF00646


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C
23.26 9.000e-25 331-
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98
5.235e-09 381-420 

09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 357-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 

SIGNATURE


10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A 
Ifl 44 5 Gl£e-19 205- 
244 BL00039B 19.19 
8.920e-16 251-277 
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA), 


PF01140D 15.54 9.663e- 






p15.


10 183-218 PF01140D 
15.54 3.093e-09 246- 
281 


449 


PR00568


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat) proteins.


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 

PR00380C 13.18 8.286e- 
17 2^0-249 pynn^flOR 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
SIGNATURE 


PR00253A 9.15 9.143e-
24 246-267 PR00253B
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA- type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481


Thiol-activated
cytolysins proteins. 


BL00481F 13.07 8.909e-
09 173-199 


479 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e-
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B 
11.83 4.333e-18 430- 
448 PR00405A 17.71 
4.971e-18 411-431 


482 


PR00049 


WILM'S TUMOUR PROTEIN 

SIGNATURE


PR00049D 0.00 9.286e- 

0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e-
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567B 18.23 2.853e-
09 200-214 


488


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat 


PF00429 31.08 7.171e- 
polyprotein).


15 21-71 


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 7.923e- 
09 185-200 


500 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510 


508 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e- 
37 762-804 PD01841E 
18.60 3.750e-36 295- 
333 PD01841J 14.94 
6.023e-35 851-888 
PD01841H 21.30 2.909e-
33 490-527 PD01841K 
14.81 7.088e-33 924- 
954 PD01841C 13.78 
9.386e-23 222-243 
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I
23.00 2.667e-13 549- 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL- 
PROLYL CIS-TRANS
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e- 
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.86 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e- 
10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase /
zeta-crystallin 
proteins . 


BL01162C 22.80 1.500e-
16 120-164 
529


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148 


533 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 4.000e- 

17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314-
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C
12.72 3.500e-20 140-
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.429e-
16 285-302 BL00028 
16.07 6.294e-14 341- 
358 BL00028 16.07 
1.346e-11 369-386
BL00028 16.07 1.692e- 

16.07 4.462e-11 453-

7.231e-11 425-442
BL00028 16.07 4.300e- 
10 313-330 


537


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
Id oli» - o ab 


539 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 

ri X JJKUIj . 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 

1 A C 
1 4 O 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 l.OOOe- 
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e- 
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G- PROTEIN 


PR00319B 11.47 2.714e- 






(TRANSDUCIN) SIGNATURE


□ 9 loo-JOl FRUOJIjA 

15.27 7.344e-09 210- 

•5 0 1 


548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins. 


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
">oi BT.ni onif 1 ? Q 1 * 

4^1 sLtUXZUriC* X j 1 OJ 

7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 

1C /I A llla.lC 1 m _ 

ID . 41 S .3Jje-16 xuz- 

116 


—£a q 

549 


FROUi Jo 


VjIlrX/Uilvj Vji f-OJLJNUXWvj 

PROTEIN FAMILY SIGNATURE 


friuujzoA 0. /d 0. jose- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PrO06j2C 2 0.66 j.JUze- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515-

1 CLA "3 
i. 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins . 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A 
^0.89 2.0D9e-14 1J0- 

153 


557 


DM00215 


PROLINE-RICH PROTEIN 3.


nunnn c 1 □ ai c 1 ioo 

umuuzId iy.«*j o . j jye- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1 . 


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins . 


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e-
10 472-488 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.977e- 
13 229-268 


569 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E
19.47 6.559e-19 508- 

3 J / 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE


PR00193D 14.36 1.857e- 
i a a i r\ Ada rjon/i n q i r 1 

267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e-
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116 


DNA polymerase family B 


BL00116A 12.81 5.737e- 
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






proteins . 


13 864-877 BL00116B 
11.82 1.529e-12 952- 
965 


578 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.360e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE 


PR00343C 16.85 2.286e- 
11 1233-1252 PR00343C 
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 
963 DM01537A 15.14 
J.186e-11 784-804


586 


PF00013 


KH domain proteins 
family of RNA binding 
proteins. 


PF00013 5.78 1.450e-09 
124-136 


587 


DM0 0 o 92 


3 RE 1 KUVIKAJj FKU 1 tlXJAaCi . 


13 262-296 


589 


BL00478


LIM domain proteins . 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590


PF00855 


PWWP domain proteins . 


PF00855 13.75 8.000e-
15 931-948


591 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.308e-13 542-
558 PR00205C 13.65
5.304e-12 594-609
PR00205B 11.39 4.273e-
10 336-354


596 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 4.789e- 

18 jUV-JJo 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3 . 


PD01675C 19.89 2.330e- 
10 55-89 


600 


BL00242 


Integrins alpha chain 
proteins . 


BL00242E 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286- 
316 BL00242D 13.57 
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80 








5.000e-11 61-73
BL00242D 13.57 4.986e- 
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e-
09 198-213


602 


PR00278


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315 


Dehydrins proteins.


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855 


PWWP domain proteins . 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.D38e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615 


PD02699 


BINDING DNA. 


Dnn9fiPQa r qi ? n"?**^- 

tr UU jC D j JJn. O . j JL a.UajC 

28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 






CORONAVIRUS NUCLEOCAPSID 
PROTEIN.


DM01206B 10.69 5.143e- 
12 531-551 DM01206B 
10.69 2.603e-10 535-
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e-
10 647-692 BL00239C
18.75 8.304e-10 543- 
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN 
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH 
dehydrogenase 75 Kd 


BL00641C 21.10 1.000e-
40 157-202 BL00641E






subunit proteins.


24.37 1.000e-40 255- 
308 BL00641F 33.12 
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23 
9.308e-29 216-240 


627 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59 
2.957e-14 282-297 
PKOUlUJiJ lU.oj 3.0776- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e- 
16 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276-
1296 DM01206B 10.69
7.658e-10 1328-1348
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320-
1340 DM01206B 10.69
7.266e-09 1326-1346


635 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BLi00i07A lo.jy v.oooe- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins . 


obUObb/A ly.jy i.b4be- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-

1 A c n *7 _ 

iu o u / — qzj 


643 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 4.913e-09 
199-212


647 


PF00628 


PHD-finger.


PF00628 15.84 2.350e- 

1J JOD"4UU rr UUDao 

15.84 3.455e-12 464-
479 


648


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 


649


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain 
proteins . 


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301 
PR00253D 16.68 7.652e- 








20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e-
11 969-997 PD01719A
12.89 3.961e-10 128-
156 PD01719A 12.89
7.395e-10 1276-1304
PD01719A 12.89 1.222e-
09 1220-1248


657 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e- 
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins. 


BLOOU^ / b.yboe- 
23 249-292 


662 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666 


PR00819


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 8.988e- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018


EF-hand calcium-binding 
domain proteins . 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608 


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614- 








629 PR00320C 13.01 
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19 
3.250e-09 572-587 


676 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.667e- 
09 249-263 


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.206e- 
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55 
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.95
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 8.071e- 
31 152-195 


692 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680 


Methionine 

aminopeptidase subfamily 
1 proteins. 


BL00680 14.37 5.304e- 
17 173-195 


697 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e-
11 242-265


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267- 
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133- 
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e- 
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e- 
15 86-98 BL00523C 
12.64 5.500e-13 137- 








148 BL00523D 9.89 
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413- 
424 


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e-
12 376-390 PR00048B
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e-
09 364-374


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e- 
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712 


DM01354


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243


Integrins beta chain 
cysteine-rich domain 
proteins . 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189 








PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039A 18.44 2.565e- 
28 26-65 BL00039D 
21.67 2.105e-20 338- 
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e-
12 41-81


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e-
2b 231-273 BL00021B
13.33 5.345e-21 60-78 


748




Osteonectin domain 
proteins . 


DJbUOblZtJ 11.3b 2.U34e- 

11 93-126 


"7 A Q 




SIGNATURE 


lrR004bOC 12.22 o.oBOe- 
10 135-157 


752 




xiivuiucrin pxcjuexxio • 


rt rtftTQcr 17 nc c nnft^ 
djjuu / x i . uo o.uuue- 

11 384-429 BL00795C 

J. / • w D J7.^TtTC iX J f \J 

415 


754 


BL00051 


AiiJUDUUIClX piUUClii 7C 

proteins. 


16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III . 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR.


PD02411 21.89 9.137e-
10 206-240


764 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins . 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B 
15.83 8.031e-10 165- 








180 BL01208B 15.83 
4.162e-09 85-100 


770 


BL00031


Nuclear hormones

receptors DNA-binding 
region proteins. 


BliOOOllA 19 ^ Q R71q_ 
Duv uu J Mn i. J , Jj 3 .3 i ic 

32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 

14.34 3.455e-11 27-44


/ / J 




Sulfatases proteins.


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-

1J i7j.-J.UJ OJjUUd^JIJ 

9.89 7.923e-12 224-236 
BL00523C 12.64 4.512e- 
10 141-152 BL00523F 

in A ^ pnia.i n 

xu • Q j j . tjiiie iu j / j - 

384 


775 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 568-585 


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e-
09 621-638


777 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030A 14.39 8.412e-
11 322-341 BL00030A
14.39 7.000e-10 220- 
239 


779 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348-
375 PR00079C 8.68 
b.Jble-lb z4fc-2t>4 
PR00079D 13.51 7.070e-

XO Z o 4 Z o J. IrKUvlu/yrt 

16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e-
17 10-35 BL00215A

i J . OZ O . UUUC" JL D i Z i 

246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690


dependent helicases 


mrincQnu i 7 m i nnno 
x3i_iu uo jud ij .jo jl.uuuc — 

12 147-165 BL00690A 

o.o/ a, j/yc iu j.j.^-j.^^1 

BL00690C 7.51 3.189e-
09 218-228


786 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e- 
10 1-21 


790 


BL00915 


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B 








22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208 



GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143
PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146
PR00208A 12.59 6.294e-
10 129-147 PR00208A
12.59 7.411e-09 130-
148 PR00208A 12.59
7.658e-09 131-149
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412


Neuromodulin (GAP-43)
proteins . 


BiJu v7 X^U -i. U • -J — "x . U w w 

12 196-247 BL00412D 
16.54 5.705e-ll 197- 
248 BL00412D 16.54 
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 


Kringle domain proteins . 


BL00021B 13.33 6.339e- 
13 40-58 


799 


BL01052 


Calponin family repeat 
proteins . 


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35 
BL01052B 15.31 1.257e- 
25 52-78 BL01052D 
10.26 5.737e-25 174- 
194 


800 


BL00348 


p53 tumor antigen 
proteins . 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins . 


BL00309C 18.65 1.621e- 
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni . 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN 
PRECURSOR 


PD02346F 12.89 4.340e- 
09 317-354 






PHOTOSYNTHESIS. 




811 


BL00685 


CBF-A/NF-YB subunit 
proteins . 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins . 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816


BL01195 


Peptidyl-tRNA hydrolase 
proteins . 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins . 


BL00520A 6.21 6.471e-
09 1-14


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e- 
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e-
13 25-45


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PROOOllC 2h . 25 b.41be- 
12 231-260 PR00011D

231 


834 


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


rUwUJ Uoa iu . zo ' . uuuc 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.898e-
09 78-111


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e- 








11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


841




TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57


845 


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e-
09 203-230


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349


850 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e-
09 12-27


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR.


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933- 
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885 
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966- 
1021 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841- 
852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1051-
1062 BL00420C 11.90 










7.955e-10 567-578 


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II 

PHOSPHODIESTERASE 
SIGNATURE 


PR00388A 10.45 2.778e-
09 64-83


859 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-
10 128-147


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107-
123 PR00988F 12.23 
7.828e-15 198-212 
PR00988E 8.27 9.769e- 
12 176-188 PR00988D 
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
zoo FKUU / /or l<d,/b 
6.769e-14 249-267 


866 


DM01688 


2 POLY-IG receptor. 


IJpIUlboolj lo .43 3. 'touts- 

09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287


RNA 3' -terminal 
phosphate cyclase 
proteins . 


n LlU X Z. O 1 H. 1 / .33 < , 6DDC- 

26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


r\M n r\ o i c i Q c. ^cia. 
DMU uz lr> 17 . »J o.^o^e- 

10 304-337 


872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85


874 


BL00188 


Biotin-requiring enzymes
attachment site 
proteins . 


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 

OQ D QR 1 *i 
\Jy £7g*Jl3 


877 


PD02102 


SUBUNIT E V-ATPASE
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins . 


BL01189A 14.27 1.000e-
40 35-71 BL01189B

it a q t nnno.^n *7 1 .. 1 O C 
i.J . 4r 1 . UUlie*4U 'i"l/5 


882




Serpins proteins.


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327


ICE NUCLEATION PROTEIN 


PR00327C 6.37 5.247e- 






SIGNATURE 


09 313-328 


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.19
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 




PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 8.200e- 
16 254-267 PD00066 
13.92 8.200e-16 282- 
295 PD00066 13.92 
8.200e-16 310-323 
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92 
8.200e-14 338-351 


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e-
09 97-111


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291-

ifiQ □'□nmmi? o i ^ 
JU? JrKUUjoJLr 7, 1J 

3.288e-22 370-392 

rKUUJOir ? . 1J /.jLoxe~ 

13 286-308 PR00381E

O . / -J 1 • UOOC 14, 4Jl £ / i 

PR00381E 8.75 7.033e- 

11 293-314 PR00381E 
8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 525-549


907 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins . 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187-
202 


924 


PD02181


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019C 14.66 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 9.308e-11






proteins proteins. 


273-284 BL00678 9.67 
1.600e-10 314-325
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97 


931 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 
20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e-
12 336-362


937 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49


940 


PR00862


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945


BL01230 


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


948


BL00479


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 7.429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 


a * Q ' 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e-
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol

phosphatidyltransferases
proteins.


BL00379 24.^4 l.^lOe- 
15 111-148 




BL01115


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962 


BL00061 


Short-chain

dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e- 
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69 
5.671e-09 38-58 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506- 
526 PF01008A 20.14 
5.875e-15 369-390 


970 


BL01277


Ribonuclease PH
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 




WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171-
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167 


Ribosomal protein L17 
proteins . 


BL01167B 20.66 8.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 

3 . BD3C J 3 Z Z J 

PR00312H 13.31 8.313e- 

J J ^Qj rRUUJItU 

13.73 5.688e-34 363-
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59


981 


PF00992 


Troponin . 


PF00992A 16.67 8.815e-
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins . 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939 


Ribosomal protein L1e
proteins . 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027


'Homeobox' domain
proteins . 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e-
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e-
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A in 4T i onn#»-
15 11-25 PR00926F
17.75 5.565e-09 120-
143


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 

BL00406D 12.58 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 

C *7C 1 A A ft a. A A i j »> •*» O 

b . /b 1.0U0e-40 147-202 
BL00406E 8.44 1.000e-
35 248-298 BL00406A


1007 


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1

(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
o . by s , oo /e-zu yo-xio 
PR00304B 11.60 7.577e- 

1? DO — O I rKUUJU4A 

9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION . 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase 
family phosphohistidine
proteins . 


BL00175A 15.42 5.179e- 

12 6-26 BL00175C 

23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins.


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-11 288-
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e- 
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-
09 111-133 


1034 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-

RIBOSYLTRANSFERASE


PR00970A 17.73 6.143e- 
20 56-78 PR00970D 


SIGNATURE 


9.96 2.154e-18 154-171 
PR00970F 12.30 1.000e-
16 224-241 PR00970G 

9.97 9.229e-15 242-258
PR00970B 16.37 1.290e- 
13 86-105 PR00970C 
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e- 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class- I proteins. 


BL01092N 13.54 8.924e-
10 3-40 


1047


BL01216


ATP-citrate lyase /

succinyl-CoA ligases
family proteins.


28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e- 
12 102-136 


1050 


BLiOlD /J 


KlDOSOmaJ- protein i_iz»*e 
proteins . 


40 12-62 




BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212 


1055 


BL00030


Eukaryotic RNA-binding

region RNP-1 proteins.


rU.nnm 14 tq 2i c >>- 
11 98-117 BL00030B 
7.03 4.316e-05 137-147


1058 




Annexins repeated
domain proteins.


23 262-317 BL00223A 
15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455


Putative AMP-binding
domain proteins.


13 280-296 


1065 


PR00019


SIGNATURE 


DDrtnm q& 1 1 to 0 non*» - 
09 115-129 PR00019B 
11.36 3.880e-09 87-101


1066 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216 
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.518e-
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins . 


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09


proteins proteins. 


298-309 


1081 


BL00326 


Tropomyosins proteins . 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460


Glutathione peroxidases 
selenocysteine proteins . 


BL00460A 28.67 3.204e- 
18 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25
5.696e-13 154-167


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25 
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 




PF00881


Nitroreductase family.


PPflflflflia 71 15 Q 5?Q«- 

ir r \J \J a O .I t\ & f . J- -J 7 i ££?C 

13 111-147 


1109


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13 20 \ 077e»- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins . 


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 5.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 
32 159-188 PR00314A 
14.53 1.281e-22 13-34 


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 4.000e- 
19 451-482 BL00107B 
13.31 3.077e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB . 


PD01652B 8.50 9.396e-
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e-
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA- . 


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e-
10 34-48 


1178 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e- 
10 205-220 PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19 
8.457e-10 35-50 
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e-
10 77-93 


1188 


BL00878 


Orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e-
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY . 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 72-101 PR00345E 
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125-
149 PR00345A 13.46
5.645e-16 43-62


1194 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 108-137 PR00345E
8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46
5.645e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate 
reductase proteins. 


BL01298A 13.90 5.959e- 
09 51-73 


1203 


BL00061 


Short-chain

dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


1206 


BL01183 


ubiE/COQ5 

methyltransferase family
proteins . 


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins . 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16.03 4.857e- 
11 49-65 PF00023B 
14.20 1.818e-09 45-55


1212 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE


PR00456E 3.06 5.348e-
11 24y-^t>4 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile . 


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239

40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 


PR00735A 11.19 6.857e-
ftq iqi -405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


Dbuuobbn jo .t? z . / /oe- 
09 75-121 


1237 


BL00027 


'Homeobox' domain 
proteins . 


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL . 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 

9.47 4.490e-10 174-189
PD01168L 9.47 7.612e- 
10 loj-liro 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins . 


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 8.286e-
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 

PKUUU /UA LZ.yZ b.aUUe- 

12 16-27 


1262 


BL00462 


Gamma-

glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
tc7 nT.nn4C3r' 4T 

/ £5±jU 4 J . i ± 

2.023e-ll 292-347 


1263 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


pkouoj /l. i / . <di z./i4e- 
18 165-182 PR00837A 

14.// *t . b 1 ze - ±.4 DD-IUD 
PR00837D 11.12 7.577e- 
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449B 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-ll 102-116 


1270 


BL00276 


Channel forming colicins 
proteins . 


BL00276A 8.87 1.500e-
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO . 


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e- 
SIGNATURE


12 119-135 PR00412C 
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine
-binding protein family
proteins . 


BL01220C 14.75 9.348e-
15 248-276 


1285


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins . 


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins.


BL00783C 22.43 6.559e- 
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.769e- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE 


PR00700D 12.47 2.200e- 
PHOSPHATASE SIGNATURE


09 211-230 


1340 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e-
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e- 
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE . 


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179- 

3.571e-24 133-159 
PR00193E 19.47 9.069e- 

9 9 DOnni Qlh 
H/V-lyy FKU013JA 

15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE-
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e-

15 299-319 PR00447D 

13.54 3.408e-15 200- 

224 PR00447A 12.73 

6.357e-11 97-124

it r\.u *± / ij o » o y J7.0//C 

10 353-373


1353 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303A 21.77 6.667e- 
26 45-82 BL00303B 


1355 


BL00039


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 

BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins . 


PF00615B 16.25 2.216e- 
12 84-101 ppnnfii Rr 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249- 
274 BL01272A 6.49 
1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.670e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94 


1364 


DM00179 


W KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-
10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins. 


BL00242B 8.13 8.615e- 
09 469-479 


1372


PR00625


DNAJ PROTEIN FAMILY

SIGNATURE 


19 46-67 PR00625A 
12 84 1 391e-16 14-34 


1373 


BL00434 


HSF-type DNA-binding 
domain proteins. 


BL00434C 23.85 3.778e-


1374 


PR00962 


LETHAL (2) GIANT LARVAE 


PR00962C 8.00 6.337e-


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 


PD02475A 23.18 8.552e- 

XU J. J. J. J. - J. J. 3 VJ 


1376 


PD01066 


PROTEIN ZINC FINGER 
BINDING NU. 


PD01066 19.43 9.571e- 


1380


BL00194


Thioredoxin family
proteins . 


12 48-61 


1381


DM01970


ENDOSOMAL III. 


U -1. i> ' U o o . o L> 1 .Sjoc - 

15 1123-1136 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.SOOe-10 
271-282 


1385 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 o.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain 
repeat proteins. 


BLOlloUB ly . b4 b-L)42e- 
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.512e- 
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
262 


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066 
13.92 8.800e-14 348- 
361 PD00066 13.92 

9.571e-12 405-418

PD00066 13.92 6.087e- 

ii ^ju- juj ruuuuoo 

13.92 8.043e-11 320-

333 


1398


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49 


i a n n 


UMU -L U b 


PROTEIN. 


09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e- 
09 176-190


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e-
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e-
09 31-44 


1412 


BL00358 


Ribosomal protein L5 
proteins . 


BL00358B 22.76 1.000e-
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e- 
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family- 
proteins . 


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e-
09 38-60 


1418 


DM00973 


3 Jew RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B 
15.02 7.049e-30 400- 
447 PD01941E 15.92 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 6.318e-
11 1009-1028 


1424 


BL50002


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19 
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928


GRAVES DISEASE CARRIER 


PR00928B 13.53 3.769e-


PROTEIN SIGNATURE 


10 103-124 


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188-
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e-
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H 
21.30 3.189e-31 435- 
472 PD01841C 13.78 
1.000e-25 185-206 
PD01841M 10.82 1.250e- 
20 1175-1194 


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


1448 


DM00315


0 72 RIBONUCLEASE 
INHIBITOR . 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 2.929e-22 4-59


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins. 


BL00545C 11.28 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129E 13.25 5.250e-22 170-195
BL01129C 25.56 9.526e-18 63-106


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-09 2114-2145


1475 


PF00686 


Starch binding domain 
proteins . 


PF00686A 13.45 9.100e- 
09 267-277 


1477 


PF00566 


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e- 
10 466-476 


1478 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


10 43-53 


1479 


DM00406 


GLIADIN . 


nvioni ofi 7 71 a 54 le- 1 0 
292-305 


1480 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins . 


BL00290B 13.17 2.385e- 
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE


ppnm cifiP in 9 039e- i 

jrrvv ui jur j_ u . *± ~> • \j 

09 21-51 


1482 


PF00780 


Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 


1483 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066 


PROTEIN 1 ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.909e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


1486 


BL00039


DEAD -box subfamily ATP- 
dependent helicases 
proteins . 


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-24 190-226
BL00166C 18.93 5.500e-14 140-167
BL00166B 16.92 9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.70Ue- 
31 63-106 BL00452E 

11.92 3.U4DC-1J 113" 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PROOUiyA 11. iy J.oo/e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-11 384-400
BL00107A 18.39 5.345e-11 322-353


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 

1U 1U /- 11 / 


1502 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain
proteins . 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins . 


BL00972D 22.55 5.500e- 

1 A 111 ii/ QTnflQ*7"3Ii 
14 Jll-JJb BliUUy / 4&J\ 

11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e- 
10 341-363 


1512 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 4.536e-22 76-106
BL00523D 9.89 1.563e-11 40-52
BL00523F 10.85 4.162e-09 159-170
BL00523G 9.46 5.333e-09 256-266


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-III
pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-19 98-122
BL00600E 16.43 1.771e-17 302-331
BL00600G 12.43 9.625e-17 377-396
BL00600B 19.60 5.091e-15 160-186
BL00600C 16.18 6.040e-12 190-206
BL00600F 8.77 1.000e-11 343-356
BL00600D 8.71 1.000e-10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e-11 192-207
PR00320B 12.19 8.839e-11 272-287
PR00320B 12.19 9.743e-10 106-121
PR00320A 16.74 1.878e-09 192-207
PR00320A 16.74 2.317e-09 106-121
PR00320A 16.74 8.683e-09 272-287
PR00320C 13.01 8.800e-09 106-121


1538 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.508e-15 171-184


1539


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed).


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e-29 312-334
PR00965E 12.91 5.846e-29 175-195
PR00965F 5.98 1.123e-28 209-231
PR00965C 15.04 1.000e-27 131-151
PR00965D 5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-27 258-279
PR00965B 4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-25 35-55
PR00965I 3.91 6.442e-25 385-406


1541 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA- BINDING 
BINDING DNA. 


PD02699C 24.84 1.000e-40 599-646
PD02699A 8.91 2.286e-34 219-248
PD02699B 18.28 6.143e-21 485-509


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-10 182-197
PR00049D 0.00 7.102e-09 67-82


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 93-142
BL00951D 13.94 8.714e-40 142-177
BL00951A 15.10 1.000e-38 2-38
BL00951B 14.23 6.250e-33 38-69


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e- 
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-09 58-73


1556


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e-13 67-105


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e- 
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.947e- 
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 8.594e-17 184-228
BL01013C 9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.400e-10 378-389
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306


1570 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 5.235e-17 297-313
BL00479A 19.86 6.625e-15 271-294
BL00479A 19.86 2.667e-14 147-170
BL00479B 12.57 6.294e-12 173-189


1576 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C 
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE . 


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins . 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4 - PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e-12 32-54
BL00411H 15.66 4.441e-11 245-276


1582 


PR00604 


CLASS IA AND IB
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-10 225-238


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER . 


DM01551C 14.62 9.455e- 
11 125-145 


1586


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01SS4S 11.61 7.750e- 
09 474-495



1587



PR00072 



MALIC ENZYME SIGNATURE 



PR00072B 13.77 7.955e-33 180-210
PR00072A 12.75 6.040e-25 120-145
PR00072C 11.42 2.286e-24 216-239
PR00072D 10.77 3.400e-22 276-295
PR00072E 10.54 1.360e-19 301-318
PR00072G 10.45 5.304e-19 433-450
PR00072F 8.87 5.935e-15 332-349



158T



BL00191



Cytochrome b5 family,
heme-binding domain
proteins.



BL00191H 15.64 1.537e-22 61-113
BL00191K 17.38 9.027e-12 398-442



1590 



DM01970 



0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 



DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 



1591 



DM00517 



5 kw NUCLEAR 60 . 7 NUP1 
CHROMOSOME . 



DM00517B 10.96 6.625e-16 1175-1193
DM00517A 8.21 1.000e-11 1015-1026



1592 



BL00037 



Myb DNA- binding domain 
proteins repeat proteins 
proteins . 



BL00037B 15.92 3.250e-27 116-142
BL00037A 16.68 2.500e-24 83-107
BL00037A 16.68 3.250e-12 31-55
BL00037B 15.92 3.526e-11 64-90
BL00037C 16.86 9.654e-10 146-164



1595



BL00028



Zinc finger, C2H2 type,
domain proteins.



BL00028 16.07 1.514e-09 110-127



1598



PF00628



PHD-finger.



PF00628 15.84 3.250e-11 1667-1682



1599 



PR00014 



FIBRONECTIN TYPE III 
REPEAT SIGNATURE 



PR00014D 12.04 5.500e- 
09 980-995 



1600



BL00518



Zinc finger, C3HC4 type
(RING finger), proteins.



BL00518 12.23 6.571e-10 30-39



1602



BL00412



Neuromodulin (GAP-43)
proteins.



BL00412D 16.54 5.402e-10 136-187



1605



PF00651



BTB (also known as BR-C/Ttk)
domain proteins.



PF00651 15.00 3.571e-10 44-57



1607



BL00252



Interferon alpha, beta
and delta family
proteins.



BL00252A 18.49 6.657e-23 20-57
BL00252B 19.78 9.125e-16 58-109



1610 



DM00215 



PROLINE-RICH PROTEIN 3. 



DM00215 19.43 1.000e-08 61-94



1611



BL00904



Protein
prenyltransferases alpha
subunit repeat proteins
proteins.



BL00904C 8.98 7.353e-10 91-125
BL00904D 1.47 6.018e-09 127-168



1612



PF00168



C2 domain proteins.



PF00168C 27.49 3.250e-09 365-391



1613



BL00412



Neuromodulin (GAP-43)
proteins.



BL00412D 16.54 6.051e-09 932-983
BL00412D 16.54 7.153e-09 933-984



1614 



BL00559 



Eukaryotic molybdopterin 
oxidoreductases 
proteins . 



BL00559I 13.63 3.531e-25 54-83
BL00559K 13.17 2.957e-18 197-224
BL00559J 19.63 6.870e-16 124-176
BL00559L 13.60 9.000e-16 266-284



1615 



PD01427 



TRANSFERASE
METHYLTRANSFERASE BI.



PD01427B 22.45 3.025e-22 500-541
PD01427A 19.94 8.773e-18 439-472



WO 01/53312



PCT/US00/34263



SEQ ID NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e-09 152-201
BL00115Z 3.12 9.603e-09 145-194


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI . 


PD01888B 25.10 1.000e-40 47-97
PD01888C 21.56 7.000e-30 125-155
PD01888A 12.84 8.800e-15 7-23


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-40 93-139
BL00325A 24.83 6.786e-23 61-93


1631 


BL00064 


L- lactate dehydrogenase 
proteins . 


BL00064B 23.57 1.000e-40 82-130
BL00064C 17.28 1.000e-40 137-182
BL00064E 27.20 1.000e-40 223-275
BL00064F 25.14 7.882e-36 286-331
BL00064A 21.16 1.000e-33 22-60
BL00064D 14.19 6.500e-31 182-212


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-11 36-49
PR00239C 3.51 2.538e-09 37-45


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e-11 364-379
PR00320A 16.74 7.828e-11 364-379
PR00320C 13.01 2.800e-10 279-294
PR00320C 13.01 2.600e-10 364-379
PR00320B 12.19 5.114e-10 279-294
PR00320A 16.74 1.659e-09 279-294
PR00320A 16.74 2.098e-09 229-244



PF00023



Ank repeat proteins.



PF00023A 16.03 6.464e-09 114-130



PR00169



POTASSIUM CHANNEL
SIGNATURE



PR00169A 16.77 1.806e-11 74-94



BL00678



Trp-Asp (WD) repeat
proteins proteins.



BL00678 9.67 2.200e-10 109-120
BL00678 9.67 5.737e-09 528-539



BL01108



Ribosomal protein L24
proteins.



BL01108A 20.33 7.366e-17 56-89



PR00380



KINESIN HEAVY CHAIN
SIGNATURE



PR00380A 14.18 9.270e-21 103-125
PR00380D 9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-16 332-351
PR00380B 12.64 6.657e-15 292-310



DM01242



3 THREONINE--TRNA
LIGASE.



DM01242C 17.15 9.791e-37 340-381
DM01242E 23.00 5.071e-31 463-505
DM01242D 23.29 3.925e-30 420-463
DM01242B 23.57 8.054e-18 265-314
DM01242F 10.61 7.618e-14 526-540



PD00126



PROTEIN REPEAT DOMAIN
TPR NUCLEA



PD00126A 22.53 5.500e-10 13-34



BL01160



Kinesin light chain
repeat proteins.



BL01160B 19.54 6.720e-11 431-485



BL00933



FGGY family of
carbohydrate kinases
proteins.



BL00933A 17.50 4.673e-12 11-35
BL00933E 13.80 9.217e-09 456-472



BL00795



Involucrin proteins.



BL00795C 17.06 2.988e-10 70-115



BL00982



Bacterial-type phytoene
dehydrogenase proteins.



BL00982A 18.41 7.750e-17 302-334



BL00982



Bacterial-type phytoene
dehydrogenase proteins.



BL00982A 18.41 7.750e-17 282-314



BL00741



Guanine-nucleotide
dissociation stimulators
CDC24 family sign.



BL00741B 14.27 1.391e-16 607-630



PR00449 



TRANSFORMING PROTEIN P21 
RAS SIGNATURE 



PR00449A 13.20 7.938e-11 114-136



1658



PR00910



LUTEOVIRUS ORF6 PROTEIN
SIGNATURE



PR00910A 2.51 8.889e-10 442-455



1659



BL00972



Ubiquitin carboxyl-terminal
hydrolases family 2 proteins.



BL00972D 22.55 4.140e-12 376-401
BL00972E 20.72 5.629e-09 446-468



1660



BL00406



Actins proteins.



BL00406D 12.58 8.767e-15 188-243



1661



PR00105



CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE



PR00105A 10.36 4.900e-13 1140-1157
PR00105B 12.32 2.800e-12 1259-1274
PR00105C 10.86 1.000e-10 1305-1319



1662



BL00280



Pancreatic trypsin
inhibitor (Kunitz)
family proteins.



BL00280 24.61 3.172e-33 3119-3163



1663



PR00319



BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE



PR00319D 11.64 6.625e-23 107-125
PR00319C 13.41 5.714e-20 89-105
PR00319A 15.27 5.286e-19 51-68
PR00319B 11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 5.050e-10 
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins. 


BL01153D 19.69 1.186e-17 115-141
BL01153C 13.67 8.977e-15 66-80
BL01153B 20.52 1.885e-10 13-37


1671 


PR00678


PI3 KINASE P85 
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 3.100e-10 1146-1169


1672 


BL00598 


Chromo domain proteins . 


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP- BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e-09 686-707


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-11 343-358
PR00049D 0.00 1.286e-10 342-357


1676 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-19 427-448
PR00747G 14.50 2.286e-16 368-393
PR00747C 12.06 7.500e-18 112-131
PR00747A 14.05 4.600e-17 42-63
PR00747D 15.23 8.759e-17 163-183
PR00747E 15.13 8.244e-15 254-272
PR00747B 7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-10 311-328


1677 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


1681 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 4.600e-10 329-340
BL00678 9.67 6.684e-09 243-254


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e-09 755-771


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-10 418-433
PR00456E 3.06 7.281e-10 419-434
PR00456E 3.06 8.125e-10 420-435


1692 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-10 487-502
PR00456E 3.06 7.281e-10 488-503
PR00456E 3.06 8.125e-10 489-504


1693 


BL00674 


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e-24 274-317
BL00674B 4.46 4.000e-23 241-263
BL00674D 23.41 8.560e-18 338-385
BL00674E 15.24 1.720e-15 414-434


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e-13 187-208
PR00466B 5.03 5.500e-11 162-186
PR00466F 9.16 6.159e-09 498-517


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-12 283-300
BL00028 16.07 3.769e-11 255-272
BL00028 16.07 5.154e-11 171-188
BL00028 16.07 5.500e-11 227-244
BL00028 16.07 1.600e-10 199-215


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e-15 62-102
BL01019B 19.49 4.000e-15 107-162


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e-14 134-153


1710 


PR00019 


LEUCINE- RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.565e-10 116-130
PR00019B 11.36 4.600e-09 113-127
PR00019B 11.36 7.120e-09 204-218


1711 


BL01159 


WW/rep5/WWP domain 
proteins . 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.4&8e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop-helix'
dimerization
domain proteins. 


BL00038B 16.97 8.448e-12 79-100
BL00038A 13.61 4.000e-11 52-68


1723


PD00567 


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567C 9.17 8.500e-09 418-428


1724 


BL01279 


Protein-L-isoaspartate(D-aspartate)
O-methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins . 


BL00018 7.41 2.059e-11 73-86
BL00018 7.41 4.176e-11 157-170


1730 


BL00594


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e-09 17-61


1731 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.676e-10 296-350


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family. 


PF00850F 15.70 4.349e-22 246-279
PF00850D 14.76 6.850e-20 177-201
PF00850E 8.88 8.691e-18 209-235
PF00850G 22.75 4.098e-14 281-323


1734 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T - CELL . 


DM00179 13.97 5.263e-10 492-502


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1745 


BL00720 


Guanine -nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e-11 45-57
PR00081E 17.54 3.935e-10 150-168


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e-14 65-91
BL00439G 13.40 2.895e-12 3-14


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-14 33-46
PD00066 13.92 1.000e-13 89-102
PD00066 13.92 7.000e-13 61-74
PD00066 13.92 6.571e-12 117-130


1753 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.750e- 
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e- 
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 3.077e-14 523-539


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e-10 371-389
BL00942B 20.36 8.040e-09 94-137


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312 


1778 


BL00084 


Copper type II, 
ascorbate - dependent 
monooxygenases proteins. 


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 3.758e-18 611-655
BL01013A 25.14 2.881e-15 344-380
BL01013C 9.97 6.308e-13 435-445
BL01013B 11.33 3.717e-12 409-420


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine- nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
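
The field order described in the footnote above can be illustrated with a short sketch. This is only an assumed reading of the RESULTS column, not part of the original disclosure: the names `Hit`, `HIT_RE` and `parse_results` are hypothetical, and the pattern assumes accessions of two letters, five digits and an optional subtype letter, as they appear in the table.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    accession: str    # accession number with subtype letter, e.g. "PR00405B"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# One hit: accession(+subtype), raw score, p-value, start-end position.
HIT_RE = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(\d+(?:\.\d+)?)\s+"
    r"(\d+(?:\.\d+)?e-\d+)\s+"
    r"(\d+)-(\d+)"
)


def parse_results(cell: str) -> List[Hit]:
    """Parse one RESULTS cell, tolerating values wrapped across lines."""
    text = " ".join(cell.split())
    # Rejoin values split at a line wrap, e.g. "5.114e- 16" or "262- 282".
    text = re.sub(r"- (?=\d)", "-", text)
    return [
        Hit(m.group(1), float(m.group(2)), float(m.group(3)),
            int(m.group(4)), int(m.group(5)))
        for m in HIT_RE.finditer(text)
    ]


cell = "PR00405B 11.83 5.114e-\n16 281-299 PR00405A\n17.71 4.306e-14 262-\n282"
hits = parse_results(cell)
for hit in hits:
    print(hit)
```

Applied to the entry for SEQ ID NO: 1427, the sketch yields two hits, one for each subtype of PR00405.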
TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 




Immunoglobulin domain 


2.1e-32


109.5


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29


110.7


4 


zf-C2H2


Zinc finger, C2H2 type 


1.6e-21


84.9


5


fn3 


Fibronectin type III domain 


0 


1097.1 


6 


fn3 


Fibronectin type III domain 


0 


1035.0


7 


fn3 


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1


9 


TBC 


TBC domain 


4e-40 


146.7 


10 


p450 


Cytochrome P450 


9.5e-17 


62.0 


12 


ank 


Ank repeat 


6e-20 


79.7 


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7


15 


zf-MYND 


MYND finger 


1.3e-06 


35.4 


16


zf-MYND 


MYND finger 


1.3e-06 


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-99 


343.9


18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6 


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit 


0 


1077.7 


26 


C1q


C1q domain


1.9e-10


44.4 


27 


Ribosomal_L23


Ribosomal protein L23


7.8e-32


111.2 


28 


Ribosomal_L23


Ribosomal protein L23


1e-29


104.2 


30 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


31 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


32 


FMN_dh 


FMN-dependent dehydrogenase


5.4e-179 


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID) 


3.8e-59 


209.9 


35 




Immunoglobulin domain 


1.4e-13


48.8 


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8 


40 


kinesin 


Kinesin motor domain 


6.7e-76 


265.6 


44 


Ets 


Ets-domain


1.4e-56


182.1 


45 


Ets 


Ets-domain 


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat 


1.7e-13 


58.3 


48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162 


552.8 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05 


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


6.5e-45


162.3


53 


PRK 


Phosphoribulokinase 


2.1e-65


230.7 


54


myb_DNA-binding


Myb-like DNA-binding domain


0.096


15.2


55 


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5


72 


ig


Immunoglobulin domain 


8.2e-28 


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase


Eukaryotic protein kinase 
domain 


2.8e-38


140.6 


76 


zf-C4_Topoisom


Topoisomerase DNA binding C4
zinc fing


5.4e-54


192.8


83 


Peptidase_S9


Prolyl oligopeptidase family


4.3e-10


36.8 


84 


fn3 


Fibronectin type III domain 


4.1e-51


183.2


86 


SH2 


Src homology domain 2 


3.1e-22


67.7 


88 


ig


Immunoglobulin domain


0.0091


14.0


89 


WD40


WD domain, G-beta repeat


2.1e-21


84.6


92 


laminin_G


Laminin G domain


6.1e-27


98.5


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13 


-37.2 




pkinase 


Eukaryotic protein kinase 


1.4e-59


211.4 


qf r 1 


— , ■ 
pkinase 


Eukaryotic protein kinase 
domain 


/. . be-bl 


183.9


□ •7 


adh_short


short chain dehydrogenase


2e-61


217.5 


□ Q 
3D 


kinesin 


Kinesin motor domain 


2.2e-86


300.4


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36 


133.0


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05


-5.2 


104 


pkinase 


Eukaryotic protein kinase 
domain 


2.7e-73 


256.9


106 


ras 


Ras family 


8.3e-24


92.5 


107 


FYVE 


FYVE zinc finger 


5.4e-27


100.7


108


Cyt_reductase


FAD/NAD-binding Cytochrome 
reductase 


7.7e-61


215.5


109 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-122


420.0 




pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


116 


PH 


PH domain 


3.1e-11


45.2


117 


lipocalin 


Lipocalin / cytosolic fatty-
acid binding pr


2.4e-14


53.5




pkinase 


Eukaryotic protein kinase
domain


4.5e-20


76.3 


120


WD40


WD domain, G-beta repeat


x . *e — xfi 


o x . x 


121 


WD40


WD domain, G-beta repeat


O A e» _ 1 A 


OX . X 


123 


TF5 e»IF4 elF 
2 


pTP4 - nammfl / ■TT?C /aT'pn -anoi 1 rir 


XC A Z 


xZZ . A 


124 




Immunoglobulin domain


6.5e-08


JU.O 


127 


mito_carr 


Mitochondrial carrier proteins 


3e-16 


58.6 


128


PP2C


Protein phosphatase 2C


2.2e-71


250.6


129 


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130 


pfkB 


pfkB family carbohydrate kinase




1 J / . X 


133 


ACBP 






CD . / 


134




DM! rormni t" i r>n moh i r 
ivi^r*. lemyiixuiuij inu l jl i. . 


x . ^e-jx 


TIB C 

IXo . D 


135 


IQ


IQ calmodulin-binding motif


2.6e-08


*t X . u 


136


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family 


9.3e-22 


85.7 


139 


WH2 


Wiskott Aldrich syndrome
homology region 2


0.0067 


«J . x 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-82 


287.5 


141 


6 


o_i.yi.icix UC^J LxUaoC X 


5 . 7e - 10 




143 


arf 


ADP-ribosylation factor family 


1.2e-39 


145.2


146 


KRAB 


KRAB box 


7.3e-30 


112.6


148 


DUF6 


Integral membrane protein DUF6 


0.096


8.0 


149


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3.8e-80 


231 . 1 


151 


S4 


S4 domain 


1.1e-08


42.3 


153 


tRNA-synt_1d


tRNA synthetases class I (R) 


3 .8e-103 


356.1 


154 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome 
reductase 


7.8e-60 


212.2 


155 


ras 


Ras family 


3 .6e-28 


107 .0 


157 


actin 


Actin 


3.8e-26


87.1


SEQ ID
NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


158 


Jacalin


Jacalin-like lectin domain 


0 . 09 


-24.9 


160 


Zn_carbOpept 


Zinc carboxypeptidase 


5e-138 


471.9 


165 


pkinase


Eukaryotic protein kinase 
domain 


5.1e-67


236.1 


167 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


5 . 3e-07 


27 . 0 


168 


Ribosomal_S1
5 


Ribosomal protein S15 


1.1e-06


29.0 


169 


DEAD 


DEAD/DEAH box helicase 


1e-48


157 .0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07


-17 .4 


172 


pkinase


Eukaryotic protein kinase 
domain 


3 . 7e-15 


58 . 6 


173


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7 . 3e- 06 


32 . 9 


175 




Ras family


1e-31


118.8 


178 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family 


2.5e-17


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2 


180 


C1q


C1q domain


8.8e-72 


251 .9 




Y_phosphatas
e


Protein-tyrosine phosphatase


4.9e-287


967.0


191 


efhand


EF hand


7 . 5e- 16 


66 . 1 


193 


pkinase 


Eukaryotic protein kinase 


6.5e-82 


285 .6 


194 


bromodomain


Bromodomain


5.8e-31 


111 .4 




PALP 


Pyridoxal-phosphate dependent
enzyme 


4. . JC o «* 


C I . — 


197


DnaJ


DnaJ domain 




141.4


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200


acid phospha 


Histidine acid phosphatase






201 


WH2 


Wiskott Aldrich syndrome
homology region 2


0.00048


26.9 


204


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


1 . 3e-159 


543 .7 


205 


vATP- 
synt_AC39


ATP synthase (C/AC39) subunit 


1.6e-139 


476.9 


206 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


2.4e-25 


97.6 


209 


ank


Ank repeat 


1 . 4e-19 


78 .4 


210 


Rhomboid 


Rhomboid family 


0 . 0035 


1 . 2 


211 


C1q


C1q domain


1 . 6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7. 4e-74 


258 .8 


213 


UQ_con


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi
n 


PMP-22/EMP/MP20/Claudin family 


4 . 5e-21 


83.4 


218 


Glycos_trans
f_2 


Glycosyl transferases 


4e-21 


83 .6 


219 


ig


Immunoglobulin domain 


0 . 092 


10.7 


222 


WD40


WD domain, G-beta repeat 


7.4e-23 


89 .4 


224 


TPR 


TPR Domain 


1.2e-08 


42 .1 


225 


DnaJ_CXXCXGX
G 


DnaJ central domain (4 repeats) 


1 .5e-38 


141 . 5 


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


3 . 4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain 


0 . 0075 


17.1 


233 


cyclin 


Cyclin 


4 . 6e-144 


492.0 


234 


ras 


Ras family 


4.8e-50


179.7 


235 


LRR 


Leucine Rich Repeat 


1 . 2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6 .7e-29 


109 .4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-69 


45.0


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate 
deaminase 


2.5e-05 


31.1 


245 




Immunoglobulin domain 


6 .7e-08 


30.5 


248 


wnt 


wnt family of developmental 
signaling protei 


9.1e-270 


742.6 


250 


mito_carr


Mitochondrial carrier proteins 


1.3e-55 


193.6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu
x


Cation efflux family 


2.8e-33 


124.0 


256 


SH3 


SH3 domain 


3.9e-14 


60.4 


257 


Aa_trans


Transmembrane amino acid 
transporter protein 


2.6e-52 


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQ
Q 


PQQ enzyme repeat 


1.6e-15 


65.0 


262 


proteasome


Proteasome A-type and B-type


6.5e-64 


225.7 


267 


pkinase 


Eukaryotic protein kinase 
domain 


6 .3e-27 


101.0 


270 


filament 


Intermediate filament proteins 


3 .2e-150 


512.5 


271 


Choline_kina 
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7 


Ribosomal protein S7p/S5e 


3 .3e-20 


80 .6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3 .3e-77 


269.9 


280 


WD40


WD domain, G-beta repeat 


7.8e-73 


255.4 


281 


WD40 


WD domain, G-beta repeat 


7 . 8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4 .6e-24 


93.4 


287 


Exonuclease 


Exonuclease 


1.4e-67 


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0 . 034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0 . 034 


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type 


1.4e-29 


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0 


296 


mito_carr


Mitochondrial carrier proteins 


4.1e-59


205.5 


297 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans 
f_4 


Glycosyl transferase 


5e-87 


302.5 


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44 


160.6 


308 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


12^.1 


309


DNA_polymera 
seX 


DNA polymerase X family 


2 .4e-64 


227.2 


311 


F-box 


F-box domain. 


9 .5e-08 


39.2 


312 


ig 


Immunoglobulin domain 


6.8e-19 


65.9 


313 


Ets 


Ets-domain 


8.1e-60


192.3 


315 


Kelch 


Kelch motif 


1.3e-106


367.6 


317 


arf 


ADP-ribosylation factor family 


3.2e-35


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73 .1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4 .9e-81 


282 .6 


324 


Xlink 


Extracellular link domain 


4 .5e-143 


331.5


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81


281.9 


331 


chromo 


'chromo' (CHRromatin 
Organization Modifier) 


4e-18 


66.7 


333 


Peptidase_M2 
2 


Glycoprotease family


1.2e-136


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2 .3e-07 


37.9 


339 


ras 


Ras family 


7 . 8e-07 


-59 . 1 


340 


zf-C2H2 


Zinc finger, C2H2 type 


8 .2e-64 


225 .4 


342 


zf-C2H2 


Zinc finger, C2H2 type 


2.4e-85 


297 .0 


343 


ig 


Immunoglobulin domain 


0 . 0005 


18 . 0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6 .5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6 .5e-65 


229.1 


351 


EGF 


EGF-like domain 


8 .5e-20 


79.2 


352 


ank 


Ank repeat 


2 . 5e-101 


350.0 


354 


TBC 


TBC domain 


5.1e-15


63 .3 


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0 .033 


15.8 


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6 .6e-34 


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4 .7e-53 


189.7 


363 


efhand


EF hand 


5 .4e-10 


46 . 6 


367 


LRR 


Leucine Rich Repeat 


8 . 8e-44 


158 . 9 


368 


laminin_G 


Laminin G domain 


1 .5e-33 


121 . 7 


369 


PP2C 


Protein phosphatase 2C 


5 .3e-20 


73 . 9 


372 


LIM 


LIM domain containing proteins 


9 .9e-15 


57 . 1 


373 


KRAB 


KRAB box 


4 . 8e-23 


90.0


376


ion_trans 


Ion transport protein 


2 .9e-09 


-4.2 


377 


Beach 


Beige/BEACH domain


4 .9e-208 


704 . 5 


380 


pkinase 


Eukaryotic protein kinase 
domain 


1 .6e-94 


327.5 


381 


AMP-binding 


AMP-binding enzyme 


1.4e-07 


-140.3 


382 


HECT 


HECT-domain (ubiquitin- 
transferase) . 


1.3e-07 


-13.5 


384 


ank 


Ank repeat 


2.5e-101


350.0 


386 




Immunoglobulin domain 


9.5e-06


23 .6 


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389 


ig 


Immunoglobulin domain 


2 .8e-15 


54.3 


390 


mito_carr 


Mitochondrial carrier proteins 


3 .5e-67 


233.2 


392 


TPR 


TPR Domain 


6.1e-17


69 . 7 


393 


SH3 


SH3 domain 


3.5e-09


43.9


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83 . 6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23 .1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0 .00049 


26 . 8 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48 .0 


405 


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC 


CXXC zinc finger 


5e-15 


63 .4 


410 


RhoGEF 


RhoGEF domain 


1.1e-23


92.1 


411 


F-box 


F-box domain. 


4 .2e-06 


33 .7 


412 


SNF2_N 


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3 .8e-24 


93.6 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523 .7 


424 


G-patch


G-patch domain 


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea 
t 


Plexin repeat 


0 . 0023 


24.6 


427 


Plexin_repea
t


Plexin repeat 


0.0023


24.6


429 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


1e-66


214 . 0 


432 


SH3 


SH3 domain 


3.4e-16


67.2 


433 


GTP_CDC 


Cell division protein 


2.1e-114


393 .5 


436 


Collagen 


Collagen triple helix repeat 
(20 copies) 


4 .6e-194 


658.1 


438 


Ricin_B_lect
in 


Similarity to lectin domain of 
ricin b 


0 . 0085 


10 . 5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1 .2e-256 


866 . 0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1 . 8e-235 


795 . 7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.9e-65 


230.9 


445 


LON 


ATP-dependent protease La (LON)
domain 


0 . 00012 


-17.1 


446 


ig 


Immunoglobulin domain 


0 . 00011 


20.1 


451


sushi 


Sushi domain (SCR repeat) 


1 . 4e-18 


75 . 2 


452 


fn3


Fibronectin type III domain 


1 . 5e-06 


35 . 2 


454 


pyridoxal_de
C


Pyridoxal-dependent
decarboxylase conse 


8 . 3e-14 


50 . 3 


456 


kinesin 


Kinesin motor domain 


4 . 9e-217 


734 . 4 


457 


neur_chan


Neurotransmitter-gated ion- 
channel 


1e-175


597 . 1 


458 


Josephin 


Josephin 


0 . 0002 


18.7 


468 


bZIP 


bZIP transcription factor 


1 . 7e-07 


31 . 8 


470 


NTP_transfer 
ase 


Nucleotidyl transferase 


6 . 3e-06 


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0 . 00021 


20.7 


477 


zf-RanBP 


Zn-finger in Ran binding
protein and others. 


0 .028 


21.0 


479 


WD40 


WD domain, G-beta repeat 


6 .5e-18 


73 . 0 


480 


KRAB 


KRAB box 


1e-31


118 . 8 


481 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


8.4e-66 


232.0 


485 


SH2 


Src homology domain 2 


0.011 


11.4 


486 


C1q


C1q domain


4 .3e-74 


259 . 6 


487 


dsrm 


Double-stranded RNA binding
motif


1.1e-47


171 . 9 


489 


zf-C2H2 


Zinc finger, C2H2 type 


4.8e-153 


521.9


490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222


751.6


492 


SKI 


Shikimate kinase 


1.2e-10


48.8 


497 


ENV_polyprot
ein 


ENV polyprotein (coat 
polyprotein) 


2.6e-22 


77.6 


498 


abhydrolase_
2


Phospholipase/Carboxylesterase


0.041 


-48 . 1 


500 


rrm 


RNA recognition motif. 


5 .4e-34 


126 .4 


501 


WW


WW domain


4.6e-18


73.4


502 


ig 


Immunoglobulin domain 


1.1e-10


39 . 5 


504 


abhydrolase 


alpha/beta hydrolase fold 


0 .045 


-3.6 


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0 


508 


Na_K_ATPase_
C


Na+/K+ ATPase C-terminus


2 .3e-145 


496 .3 


509 


Exonuclease 


Exonuclease 


1.3e-56 


201.5


510 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


511 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1 


1.9e-09 


38.5 


514 


pro_isomeras 
e 


Cyclophilin type peptidyl-
prolyl cis-tr 


1.8e-63 


221.4


515 


EGF 


EGF-like domain


1.9e-18


74 . 7 


516 


Surp 


Surp module 


4.3e-38 


140 .0 


523 


ig


Immunoglobulin domain 


3 . 3e- 06 


25 . 0 


526 


UBX 


UBX domain 


1.1e-34


128 . 6 


528 


adh_zinc


Zinc-binding dehydrogenases


2.7e-34 


127 .4 


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10 . 0 


531 


adh_short 


short chain dehydrogenase 


0 . 0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins 


2.5e-81


281.7 


533 


mito_carr


Mitochondrial carrier proteins 


2e-61


213 . 5 


534 


thiolase 


Thiolase 


3.5e-183


622 .0 


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153 .7 


536 


SCAN 


SCAN domain 


4e-55 


196 .6 


537 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V) 


3.1e-136 


466.0 


538 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136


466 .0 


539 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V)


1.9e-117


403 .6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136


466 . 0 


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295 . 7 


543 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-^9 


242. £ 


544 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38 


139 .0 


545 


TGFb_propept
ide 


TGF-beta propeptide 


1.1e-67


238 .2 


547 


WD40


WD domain, G-beta repeat 


2 . 6e-32 


120 . 8 


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686 .2 


549 


MMR_HSR1


GTPase of unknown function


5.4e-67 


236 . 0 


551 


HECT 


HECT-domain (ubiquitin- 
transferase).


4 .3e-127 


435 .6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3 .5e-74 


259 . 8 


555 


zf-UBR1


Putative zinc finger in N- 
recognin 


3 .3e-16 


67 .3 


556 


Kelch 


Kelch motif 


5.5e-29 


109. 7 


561 


AMP-binding 


AMP-binding enzyme 


2. 8e-06 


-163 .7 


562 


PABP 


Poly-adenylate binding protein, 
unique domai 


4.9e-38 


139 .8 


564 


Gag_p30


Gag P30 core shell protein 


1 . 2e-67 


238 . 2 


566 


PWWP 


PWWP domain 


8.1e-16


66 . 0 


567 


SCAN 


SCAN domain 


7 . 3e-68 


238 . 9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase 


Eukaryotic protein kinase
domain 


1 . 5e-84 


294 . 3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081


-79.7


572 


myosin_head


Myosin head (motor domain) 


0


1495.2


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4


575 


Surp 


Surp module 


1 . 7e-23 


91.5 


576 


Surp 


Surp module 


1 . 7e- 23 


91.5 


577 


DNA_pol_B 


DNA polymerase family B 


0 


1138.6


578 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


8.3e-09


42.7


579 


LRR 


Leucine Rich Repeat 


4 . 9e-21 


83 .3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177


601.3


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36 


116 .3 


586 


KH-domain


KH domain 


2.9e-13 


57 .5 


587 


G-patch 


G-patch domain 


2.3e-14 


61 .2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain


Bromodomain 


6.6e-32 


114 . 7 


591 


bromodomain


Bromodomain 


6.6e-32 


114.7


592 


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87 . 1 


593 


PHD 


PHD-finger


3 . 8e- 12 


53 . 8 


594 


cadherin 


Cadherin domain 


4 . 2e- 99 


342 . 7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2


con 


WD40


WD domain, G-beta repeat


0.00054


26 . 7 


600 


FG-GAP


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191 . 8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300 . 4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies)


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6 . 3e-67 


232.3 


608 


PWWP 


PWWP domain 


2 .6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY 


CAP-Gly domain 


0 . 0046 


20.1 


615 


RFX_DNA_bind 
ing 


RFX DNA-binding domain


5.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284 . 8 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278 .5 


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1 


620 


MATH 


MATH domain 


7.8e-05


22.2 


621 


Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4 . 4e-40 


146.6 


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri 
n 


Prokaryotic molybdopterin 
oxidoreductas 


1 . 4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72 .2 


627 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3 . 7e-58 


206 .6 


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type 


2.1e-88


307.1 


632 


rrm 


RNA recognition motif.


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain 


5 . 9e-27 


103 .0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3 . 8e-70 


246 .5 


642 


TPR 


TPR Domain 


4 . 8e-08 


40.1 


643 


efhand


EF hand 


1 . 9e-27 


104 .6 


647 


SNF2_N 


SNF2 and others N-terminal
domain 


1 . 2e-101 


351.1 


648 


PseudoU_synt
h_2


RNA pseudouridylate synthase 


1 . 9e-55 


197 .6 


650 


zf-C2H2


Zinc finger, C2H2 type 


0 . 0087 


22 . 7 


651 


ank


Ank repeat 


1 . 3e-17 


71 . 9 


652


I_LWEQ


I/LWEQ domain


9.5e-101


341.0 


b b J 


neur_chan


Neurotransmitter-gated ion- 
channel 


4.1e-171


581.8




tsp_1


Thrombospondin type 1 domain 


4.1e-47


169.9 


f C ft 

Ob? 


FH2 


Formin Homology 2 Domain 


1e-107


371.2


661 




Pou domain - N-terminal to
homeobox domain




162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2 


664 


C2 


C2 domain 


6.7e-19 


76.2 


667 


GST 


Glutathione S-transferases.


9.3e-34 


114 .4 


668 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341 .0 


672 


ABC_tran 


ABC transporter 


5.3e-60 


212 .8 


674 


WD40 


WD domain, G-beta repeat 


4 .8e-24 


93.3


675 


WD40 


WD domain, G-beta repeat


A Q o _ 1 A 


Q9 " i 


676 


LRR 


Leucine Rich Repeat 


0 .0015 


25.2 


679 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


2 . 6e- 29 


107 . 7 


680


zf-C2H2


Zinc finger, C2H2 type 


5.2e-05 


30 .1 


681


CH 


Calponin homology (CH) domain 


1 A a T 9 




682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


A 1a A 1 


'ice c ' 


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


c\ nm 


in r 


687 


Synapsin






1890 . 8 


689 


rKb b 


rtULclil pxjLj£»^>ild L a be Z>v 
▼*o/tii1 ^^r\v~\T Cnniim Ol? 


0


1038 . 8 


691 


homeobox 


Homeobox domain 




LXZ . t 


f Of 


Peptidase_M2
4


metallopeptidase family M24


Z - DC D 3 


210.5 


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128 .9 


698 


PHD 


PHD-finger


u . uuo 


Q 1 


701 


zf-C2H2


Zinc finger, C2H2 type 




499 n 


702 


Sulfatase


Sulfatase


la 9 9 1 


9 Q 1 £ 


703


zf-C2H2


Zinc finger, C2H2 type


~> . /e- zu 




707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8 


708 


WD40


WD domain, G-beta repeat 


4 . 8e-19 


76.7 


710 


Ran_BP1


RanBP1 domain.


8 .4e-06 


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9 . 9e-42 


134.9 


714 


PH 


PH domain 


1 . 6e-09 


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1 . 5e-37 


138.2 


717 


Sialyltransf


Sialyltransferase family


7.5e-31


XlD. 3 


718 




Immunoglobulin domain 


1e-29


100.8 


719 


integrin_B


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine 
protease 


3e-145


495.9 


723 


ig 


Immunoglobulin domain 


2 .2e-05 


22.4 


724 


F-box 


F-box domain. 


0 . 007 


23 . 0 


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5


726 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5


727 


WD40 


WD domain, G-beta repeat 


n c a ic 


99.3


730 


dsrm 


Double- stranded RNA binding 
motif 


0 . 027 


12 . 1 


731 


dynamin 


Dynamin family 


4 .2e-16 


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


9 Do.l n 

z . oe-iu 


A 1 9 

41./ 


9 9 c 


CDP - 


r^Vir*«mVi» ¥ A rivl fc ranflf Ara a-* 


4 . 2e- 26 


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57 


182.5 


739


TSC22 




6.5e-32


119 . 5 


742 


ras 


Ras family 


2.2e-100 


346.9 


743 


PMI_typeI


Phosphomannose isomerase type I


"1 9 » — 9 A 9 


822 9 


747 


trypsin 


Trypsin




9 9 Q A 


748 


kazal


Kazal-type serine protease 
inhibitor domain 


' 9 9o 




749


efhand


EF hand 


c 9o - n c 
o . j e - u o 


"4 9 1 


/ ox 


PHD 


PHD-finger


4.9e-16


66 . 7 


752 


zf-C2H2 


Zinc finger, C2H2 type 


3 .2e-21 


83 . 9 


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8 


754 


Ribosomal_L3 
9 


Ribosomal L39 protein


0.00018 


26.7 


755 


PH


PH domain 


3.6e-14 


55. 7 


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0 .0065 


23.1 


760 


arf 


ADP-ribosylation factor family 


2.2e-19 


77. 8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


762 


histone 


Core histone H2A/H2B/H3/H4


9 .9e-53 


188.6 


763 


zf-MYND 


MYND finger 


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767 


vwc 


von Willebrand factor type C 
domain 


2 . 9e-34 


127 .3 


769 


efhand 


EF hand 


4.8e-11


50. 1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains) 


2 .4e-53 


1B1 . 6 


772 


ras 


Ras family 


7e-90 


312 . 0 


773 


Sulfatase


Sulfatase


1e-142


487.5 


775 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


777


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32


121.1 


779 


G6PD 


Glucose-6-phosphate
dehydrogenase


1.5e-76 


236.6 


780 


spectrin 


Spectrin repeat 


3 . 7e-29 


110 .3 


781 


mito_carr 


Mitochondrial carrier proteins 


4 .6e-57 


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21. 7 


786 


ras 


Ras family 


5.3e-39 


143.0 


787 


RNase_HII 


Ribonuclease HII 


2 .5e-67 


237.1 


790


PI3_PI4_kina
se


Phosphatidylinositol 3- and 4-
kinases 


5.4e-108 


372 .2 


795 


cadherin 


Cadherin domain 


2 .5e-40 


147 .4 


796 


ARID 


ARID DNA binding domain 


1 .6e-20 


81 . 6 


797 


trypsin 


Trypsin 


9.9e-20 


64 . 8 


799 


CH 


Calponin homology (CH) domain 


3 .7e-15 


63 . 8 


801 


Gal- 

bind_lectin 


Vertebrate galactoside-binding 
lectin 


4.1e-25


88 . 7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1 .8e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80


278. 5 


811 


CBFD_NFYB_HM
F 


Histone-like transcription 
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20


79.3 


814 


IMP4 


Domain of unknown function 


3 .3e-71 


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816 


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18


74 .3 


826 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon 


1. 6e-32 


121 .5 


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191 .3 


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1 


832 


laminin_EGF 


Laminin EGF-like (Domains III
and V) 


2e-57 


204 .2 


839 


rrm 


RNA recognition motif. 


1.3e-22 


88 .5 


840 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100


346 . 3 


844 


Ribosomal_L2 
2e 


Ribosomal L22e protein family 


1e-64


228.4 


846 


IBR 


IBR domain 


9e-15 


62.5


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0 .00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2


852


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853


SRCR 


Scavenger receptor cysteine- 
rich domain 


0 


1025 . 4 


857 


lactamase_B


Metallo-beta-lactamase
superfamily 


0.012 


-6 . 0 


858 


COX6A 


Cytochrome c oxidase subunit 
VIa


3.4e-58


206 . 7 


859


rrm 


RNA recognition motif. 


5.4e-45


162 . 9 


861 


PRK 


Phosphoribulokinase 


5.1e-62 


219.4 


863 


mito_carr


Mitochondrial carrier proteins 


2 .9e-53 


185 . 5 


864 


HSP90 


Hsp90 protein 


4 . 7e-158 


538 . 5 


866 


ig 


Immunoglobulin domain 


4e-12 


44 .1 


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5


872 


histone 


Core histone H2A/H2B/H3/H4 


4 . 9e-41 


149.8


874 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0


879 


Ribosomal_S1
2e


Ribosomal protein S12e


2.1e-98


340.3


882 


serpin 


Serpins (serine protease 
inhibitors) 


2 . 5e-42 


145 . 7 


883 


Patatin 


Patatin 


1 . 2e-51 


182.0


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0 . 044 


8 . 0 


887 


DUF92 


Integral membrane protein DUF92 


2 .7e-12 


54 . 3 


889 


sugar_tr 


Sugar (and other) transporter 


8 . 2e-63 


2i2 . 1 


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43 


158.3 


896 


IP_trans 


Phosphatidylinositol transfer 
protein 


6 .5e-98 


338.7 


898 


DEAD 


DEAD/DEAH box helicase 


1 . 5e-48 


156 .5 


899 


KE2 


KE2 family protein 


7e-61 


215 .7 


900 


KE2 


KE2 family protein 


4 .3e-51 


183 . 2 


901 


zf-C2H2 


Zinc finger, C2H2 type 


2 .7e-57 


203 . 8 


902 


ras 


Ras family 


2.3e-75 


263 . 8 


904 


TPR 


TPR Domain


3 .2e-22 


87.2 


906 


GBP 


Guanylate-binding protein


8.9e-253


853 .1 


907 


GBP 


Guanylate-binding protein


1.1e-239


809 . 6 


908


WD40


WD domain, G-beta repeat 


2 . 6e-26 


100 . 8 


909 


PH 


PH domain 


1.3e-09 


39.4 


910 


zf-C2H2 


Zinc finger, C2H2 type 


2 . 5e-39 


144 . 1 


913 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat 


1 . 6e-25 


98 . 2 


923 


WD40


WD domain, G-beta repeat 


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2 . 9e-05 


29 . 1 


925 


UQ_con 


Ubiquitin-conjugating enzyme


0 . 00033 


-27 .6 


926 


CH 


Calponin homology (CH) domain 


3 . 3e-53 


190 . 2 


928 


WD40 


WD domain, G-beta repeat 


5 . 9e-48 


172.7


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10


37.4 


930 


Ribul_P_3_ep 
im 


Ribulose-phosphate 3 epimerase 
family 


7 . 2e-105 


361.8


931 


Ribul_P_3_ep
im 


Ribulose-phosphate 3 epimerase 
family 


1 .2e-96 


334 .4 


936 


C2 


C2 domain 


2 . 2e-62 


220.7 


937 


NAP_family 


Nucleosome assembly protein 
(NAP) 


1.1e-22


84.6 


940 


abhydrolase 


alpha/beta hydrolase fold 


0 . 011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3 .2e-07 


25.1 


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75 


263.2 


949 


WD40


WD domain, G-beta repeat 


1 . 8e-27 


104 .7 


950 


Acyltransfer
ase


Acyltransferase


1 .6e-07 


38.4


951 


SAM 


SAM domain (Sterile alpha 
motif) 


0 .014 


14 . 5 


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0 


955 


BTB 


BTB/POZ domain 


7e-22 


86 . 1 


956 


BTB 


BTB/POZ domain 


7e-22 


86 .1 


957 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


0 .053 


-22 . 2 


959 


ras


Ras family


2.4e-97


336 . 8 


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08


42.2


962 


adh_short


short chain dehydrogenase 


2.4e-31


117.6


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969 


IF-2B 


Initiation factor 2 subunit
family






970 


RNase_PH 


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain


5.7e-25


96.4


977 


PDZ


PDZ domain (Also known as DHR
or GLGF).


3.6e-21


83.7


978 


Ribosomal_L17


Ribosomal protein L17 


2.4e-20 


81.0 


979 


LIM 


LIM domain containing proteins


D . oe-42 


152.8


980 


Calsequestrin


Calsequestrin 


1.7e-297


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family


1.2e-10


4 J . 2 


983 


oxidored_q6


NADH ubiquinone oxidoreductase, 


4.8e-63


222.9 


988


TBC 


TBC domain 


2.2e-50


180.8 


989 


TBC 


TBC domain


2.2e-50


180.8 


993 


tRNA_int_endo


tRNA intron endonuclease


0.0017


-34.2 


994 


homeobox




4e-18


73.6 


997 




oxidoredur ha 




11.6 


1000 


mito_carr


Mitochondrial carrier proteins


9.7e-123


til . z 


1001 


RA 


Ras association (RalGDS/AF-6)
domain 


1.2e-15 


65.4 


1004 


DUF81 


Domain of unknown function 
DUF81


0.099




1005 


actin 


Actin 


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130


428.6 


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6 


1011 


zf-C2H2


Zinc finger, C2H2 type 


3.6e-61


216.6


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-15 


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15


55.2


1018


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18 


69.7


1026 


HMG_box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


*7 Ip.iC 
/ . Jc"*D 


J. 0 4 . 3 


1028 


UQ_con


Ubiquitin-conjugating enzyme




i "7 n t 


1032 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


U . (J X O 


-LO . J 


1034 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.8e-05


32.4 


1038 


Cation_efflux


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18 


74.7


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c


Lectin C-type domain 


1.9e-28 


108.0


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase


0.00013


-25.1 


1047 


ligase-CoA 


CoA-ligases 


4.5e-80


279.4


1043 


ig 


Immunoglobulin domain 


1.7e-09 


35.6


1050 


Ribosomal_L24e


Ribosomal protein L24e 


2e-33 


124.5


1054 


Amidase 


Amidase 


4.3e-152 


518.7 


1055 


rrm 


RNA recognition motif. 


3.8e-26 


100.3 


1058 


annexin 


Annexin


6.9e-44


159.2 


1059 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox 


Homeobox domain 


3.2e-31


117.2 


1062 


Acyltransferase


Acyltransferase 


0.000*5 


10.5 


1064 


AMP-binding 


AMP-binding enzyme 


6.6e-100


345.3 


1065


LRR 


Leucine Rich Repeat 


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141.8 


1071 


ig 


Immunoglobulin domain 


8.4e-48


159.1 


1072 


PHD 


PHD-finger


6.8e-07


36.3


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33


121.5


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8 


1077 


OLF 


Olfactomedin-like domain 


2.2e-69


234.0


1078 


mito_carr


Mitochondrial carrier proteins 


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat 


6.2e-45


162.7


1087 


START 


START domain 


1.5e-48


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4 


1094 


GSHPx 


Glutathione peroxidases


9.6e-41


148.8


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75


262.4


1105 


Nitroreductase


Nitroreductase family 


1.3e-13 


58.6 


1106 


PTE 


Phosphotriesterase family 


1.3e-179 


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049


19.6 


1109 


ras 


Ras family 


1.3e-15 


40.7 


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


9.7e-47


168.7


1116 


HMG14_17 


HMG14 and HMG17 


4.4e-21


83.5


1117 


HMG14_17 


HMG14 and HMG17 


9.9e-12


52.4


1119 


FAA_hydrolase


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83 


290.6 


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94


327.6


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


89.0


1129 


pro_isomerase


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.1 


1131 


DnaJ 


DnaJ domain 


1.6e-30


114.9


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6 


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9


1134 


PH 


PH domain


0.0015


17.8


1136 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


1139 


ras 


Ras family 


1.5e-86 


301.0


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9 


1157 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain 


3.9e-05


30.5 


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43


157.3


1168 


SAM 


SAM domain (Sterile alpha
motif)


0.04


10.5


1170 


abhydrolase 


alpha/beta hydrolase fold. 


0.098 


-7.5


1174 


SAP 


SAP domain 


3.9e-10 


47.1


1177 


PP2C 


Protein phosphatase 2C


5.3e-31


112.5


1178 


WD40


WD domain, G-beta repeat


4.7e-35


129.9


1180 


Ets 


Ets-domain


1.8e-09


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


n nnni 

U . UUU1D 


OA 7 


1182 


TCL1_MTCP1 


TCL1/MTCP1 family




198.6


1184


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62


217.3 


1187 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC


Pyridoxal-dependent
decarboxylase


6.2e-128


430.6


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1194 


Stathmin 


Stathmin family 


1.8e-90


314.0


1195 


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


3.1e-32


111.8


1197 


Glyco_transf_8


Glycosyl transferase family 8 


1.2e-09


45.5


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022


-16.8 


1203 


adh_short 


short chain dehydrogenase 


Q "lei A C 


Xu* . J 


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family




AIT A 


1208 


7tm_3


7 transmembrane receptor


7.2e-09


29.0 


1209 


ank 


Ank repeat 


3.9e-15


63.7 


1210 


vATP-synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand




37.4


1219 


rrm 


RNA recognition motif.


2.1e-40


147.7


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1 Co.TI 
i. . 3E- / X 


251.1


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1 1 CD O 


1232 


PX 


PX domain 


*5 Id 1 C 

/. . ze- lb 


CA C 

bi . b 


1233 


PX 


PX domain 


2.2e-15




1236 


FCH 


Fes/CIP4 homology domain 




a a n 
. u 


1241 


Peptidase_M20


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006 


6.3e-61


z lb . a 


1248 


Glycos trans 


Glycosyl transferases 


4.5e-10




1249 


efhand


EF hand


4e-11


bu . si 


1254 


UQ_con 


Ubiquitin-conjugating enzyme


2.1e-73


O C *7 1 


1255 


ras


Ras family


2.2e-62


220.7


1256 


formyl_transf


Formyl transferase 


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13


46.4 


1261 


DiHfolate_red


Dihydrofolate reductase


2.1e-63 


241.7 


1262 


G_glu_transpept


Gamma-glutamyltranspeptidase


1.8e-110


380.4 


1263 


PAS 


PAS domain 


1.3e-08


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9


1266 


SCP


SCP-like extracellular protein 


6e-29 


108.0


1267 


K_tetra


K+ channel tetramerisation
domain


2.8e-27 


104.0 


1269 




Ras family


1.3e-85


297.9


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


69.8 


1277 


abhydrolase


alpha/beta hydrolase fold


5.6e-21


63.1


1279 


trypsin


Trypsin


4.4e-41


132.0


1280 


PBP 


Phosphatidylethanolamine-


1.3e-13


58.7


1285 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.6e-14


49.6


1287 


ank 


Ank repeat


1.7e-52


187.8


1294 


fn3


Fibronectin type III domain


0.026


20.9


1295 


GBP


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


6.9e-41


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14 


60.7 


1298 


LIM


LIM domain containing proteins 


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43


145.2 


1 1 ft "7 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308


WD40


WD domain, G-beta repeat 


1.6e-17 


71.6 


1310


UPAR_LY6


u-PAR/Ly-6 domain 


7.1e-20


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05 


21.6 


1314


Aa_trans


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316


trypsin 


Trypsin 


4.4e-41


132.0 


1320 


Ribosomal_L13


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_seg


Armadillo/beta-catenin-like
repeats


0.0054


23.4 


1328 


KRAB 


KRAB box 


0.052


-5.6 


1329


rrm 


RNA recognition motif. 


2.1e-40


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10


48.0 


1333


KRAB 


KRAB box 


1.8e-36


134.6 


1334


UPP_synthetase


Putative undecaprenyl 
diphosphate synt 


2.3e-89


310.3 


1335


UPP_synthetase


Putative undecaprenyl 
diphosphate synt 


1.8e-59


211.0 


1336


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.2e-31


118.6 


1337 


DSPc 


Dual specificity phosphatase, 
catalytic doma


2.3e-12


54.5


1338


TPR 




n a ft no i 


28.1


1340 


metalthio


Metallothionein 


0.013


20.3 


1341 


mutT 


Bacterial mutT protein


5.8e-09


36.5


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif


1.4e-44


161.5


1345 


Antifreeze


Antifreeze protein


1.2e-10


48.8


1347 




3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2


1348 


BTB 


BTB/POZ domain


5.3e-28


106.5


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202


686.6


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase


3.6e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57 


203.1 


1360 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17 


HMG14 and HMG17 


7.9e-40 


145.7


1362


SIS 


SIS domain 


3.8e-30


113.6


1363 


SIS 


SIS domain 


1.3e-28


108.5 


1364 


ig 


Immunoglobulin domain 


0.00026


19.0


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


68.9 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113


390.1


1372 


DnaJ 


DnaJ domain 


6.6e-36


132.7


1376 


KRAB 


KRAB box 


2.1e-38


141.0


1378 


ELM2


ELM2 domain


2e-23


91.3


1586 


thiored 


Thioredoxin 


1.2e-23


82.8


1381 


ank 


Ank repeat 


2.3e-83 


290.4


1382 


BTB 


BTB/POZ domain 


3e-11


50.8


1383


WD40


WD domain, G-beta repeat 


1.6e-19 


78.3 


1384 


WD40


WD domain, G-beta repeat 


6.3e-24 


92.9


1387


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


35.6 


1389 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-50 


179.5 


1390 


zf-C2H2


Zinc finger, C2H2 type


2.5e-85


296.9


1393 


kinesin 


Kinesin motor domain 


7.8e-188


637.4


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22


86.6


1402 


bZIP 


bZIP transcription factor 


0.035


13.1


1405 


sugar_tr 


Sugar (and other) transporter 


0.003


-101.5


1406 


RhoGAP 


RhoGAP domain 


8.9e-47 


168.8 


1407 


rrm 


RNA recognition motif. 


1e-35


132.1 


1408


LRR 


Leucine Rich Repeat 


2.1e-13


58.0


1409 


Nebulin_repeat


Nebulin repeat 


6e-54 


192.6 


1410 


ank 


Ank repeat


1.6e-17


71.6


1412 


Ribosomal_L5_C


ribosomal L5P family C-terminus


8.2e-58 


205.5 


1415 


trypsin 


Trypsin 


4.7e-85 


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1 


1419 


WD40


WD domain, G-beta repeat 


2.2e-09 


44.6 


1422 


cadherin 


Cadherin domain 


8.3e-42


152.3 


1424 


SH3 


SH3 domain 


2.5e-80 


280.3 


1425 


PHD 


PHD-finger


3.2e-17


70.6 


1426 


PHD 


PHD-finger


3.2e-17


70.6


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1e-37


138.8


1428


helicase_C 


Helicases conserved C-terminal
domain


1e-26


102.2 


1429 


WD40


WD domain, G-beta repeat


3.9e-07


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-10


40.2 


1431 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7 


1433 


C1q


C1q domain


2.9e-16


66.2


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13


58.3


1435 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase


7e-228


770.4


1436 


rrm 


RNA recognition motif. 


1.4e-34


128.3


1438 




Immunoglobulin domain 


1.3e-12


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7


1443 


Kelch 


Kelch motif 


n n n n 1 1 


9 ft 7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7


1447


zf-C2H2


Zinc finger, C2H2 type 


9.4e-28 


105.5 


1448 


AMP-binding 


AMP-binding enzyme 


2.6e-07


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


62.9


1454 




Immunoglobulin domain 


5.6e-44


146.7 


1455 


Sialyltransf 


Sialyltransferase family


5.4e-21


83.2


1460 


Aldose_epim 


Aldose 1-epimerase 


1.9e-35


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470 


TIG 


IPT/TIG domain 


3.1e-19


77.3


1472 


PseudoU_synt 


RNA pseudouridylate synthase 


4.3e-16 


66.9 




fa_2 








X* 1% 




DENN (AEX-3) domain


1.3e-44


161.6


1475 


Cation_efflux


Cation efflux family 


4.6e-49 


176.4 


1477 


TBC 


TBC domain 


8e-47 


169.0 


1478




RNA recognition motif. 


2e-21 


84.6


1480 




Immunoglobulin domain 


5.5e-06


24.3


1484 


Telo_nina al 
pha 


Telomere-binding protein alpha
subuni


0.028


-225.9


1485 


zf-C2H2


Zinc finger, C2H2 type


1.8e-68


240.9 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13


49.9 


1488 


helicase_C 


Helicases conserved C-terminal
domain 


1.4e-15 


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family


5.2e-41


149.7 


1491 


guanylate_cyc


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2 


1495 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10


36.3 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8 


jlduu 


SH3


SH3 domain


9.3e-05


27.2 


1502 


homeobox 


Homeobox domain 


0.084


13.8


1503


homeobox 


Homeobox domain 


0.084


13.8


1505 


EGF


EGF-like domain


2.7e-23


90.8


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508 


Peptidase_M20


Peptidase family M20/M25/M40


2.8e-28


101.8


1 C 1 1 


PX


PX domain


1.9e-11


51.5


Id xc 


Sulfatase


Sulfatase


2.8e-35


130.7


1516 


Syntaxin 


Syntaxin 


0.011 


-62.3 


1518


aminotran_3


Aminotransferases class-III
pyridoxal-pho


9.7e-106


305.6 






Immunoglobulin domain 


0.075


11.0


1521 


RA 


Ras association (RalGDS/AF-6)
domain 


0.013 


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05


18.7 


1528 


WD40


WD domain, G-beta repeat


5.4e-24


93.1


1535


IMS 


impB/mucB/samB family 


7.8e-95


328.5


1538


FYVE 


FYVE zinc finger 


3.2e-27


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


4. U 


Ocular_alb


Ocular albinism type 1 protein 


0 


1184 . 7 


J. D DJ 




SAP domain


6e-06


33.2


1654 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 


1655 


Amino_oxidase


Flavin containing amine oxidase


3.2e-43


157.0


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function


0.0011


-45.5


1659 


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11




1660 


actin 


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-15 


46.9 


1 o / £ 


chromo


'chromo' (CHRromatin
Organization Modifier)


2.1e-18


67.7


1674 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


0.0025


17.6


1676 


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187


636.2


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47


4.5e-74


259.5


1680


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1681


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1683




GTPase of unknown function 


1.8e-78


274.1


1691 


rrm


RNA recognition motif.


1.8e-37 


137.9 


1692




RNA recognition motif.


1.8e-37


137.9


1693 


AAA 


ATPases associated with various 

cellular act


1.3e-81


284.5


ICQ! 


Ferric_reduct


fClilC JLUUULLaoC 


8.4e-82


285.2


lb jo 


Ferric_reduct


transmembrane com 


3.5e-53


190.1


1699 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-34


126.6 


1700


arf


ADP-ribosylation factor family


9e-19


75.8


1702 


GTP_EFTU


Elongation factor Tu family 


0.014


11.4 


1703 


SCAN 


SCAN domain


1.8e-54


194.4


x / u / 


pkinase


Eukaryotic protein kinase
domain


1.2e-88


307.9 


1709 


WD40 


WD domain, G-beta repeat 


0.0035


24.0


1710 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


1711


WW


WW domain


7.6e-12


52.8


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


Z. . DC- U3 


-3 Q -3 
JO .J 


1714 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3


1 /lb 


ras 


Ras family 


4.4e-41


149.9


1718 


HMG_box


HMG (high mobility group) box


8.3e-21


82.6


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH 


Helix-loop-helix DNA-binding
domain


9.2e-10


A C Q 


1723 


dsrm 


Double-stranded RNA binding


2.9e-05


30.9 


1724


RrnaAD


dimethylases


0.045


9.2 


1725


CIDE-N 


CIDE-N domain 


5.9e-40


146.2


1726 


HAT 


HAT (Half-A-TPR) repeats 


2.9e-44


160.5


1728 


efhand


EF hand 


5.1e-20


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34


126.6


1739 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase


0.0023


16.1 


1743 


ras 


Ras family 


3.7e-10


-21.3 


1744 


ras 


Ras family 


3.7e-10


-21.3 


1745 


RasGEF 


RasGEF domain 


3.2e-49


176.9


1746 


adh_short 


short chain dehydrogenase 


7.1e-08


34.6 


1751 


zf-C2H2 


Zinc finger, C2H2 type 


9e-39


142.2 


1754 


fn3 


Fibronectin type III domain 


5.5e-101


348.9 


1756 


zf-C2H2 


Zinc finger, C2H2 type


6.3e-93 


322.1 


1758 


rrm


RNA recognition motif.


0.017


21.2


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765 


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06


-43.9


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


Oxysterol_BP 


Oxysterol-binding protein


4.7e-56 


199.6 


1781 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF


RhoGEF domain 


1.6e-23 


91.6 


1785 


rrm 


RNA recognition motif. 


6.4e-14 


59.7 



TABLE 5



SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


1 


1-21 


0.991 


0.955 


2 


1-31 


0.995 


0.944 


3 


1-33 


0.949 


0.736 


4 


1-19 


0.970 


0.951 


5 


1-26 


0.971 


0.863 


6 


1-26 


0.971


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971 


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991


0.955


11 


1-23 


0.989 


0.899


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938


0.876


15 


1-25 


0.941


0.811


16 


1-17 


0.972


0.939


17 


1-27 


0.964


0.777


18 


1-16 


0.914


0.657 


19 


1-19 


0.953


0.840 


20 


1-20 


0.935 


0.701 


21 


1-22 


0.974


0.850


22 


1-33 


0.961 


0.895


23 


1-19 


0.991


0.959


24 


1-31 


0.995 


0.944


25 


1-22 


0.976 


0.935


26


1-27


0.996


0.928


27 


1-24 


0.953 


0.739


28 


1-21 


0.906


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893


31 


1-19 


0.993


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949


0.736


46 


1-19 


0.970 


0.951 


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30 


0.991


0.919 


75 


1-29 


0.958 


0.854 


88 


1-20 


0.986 


0.945 


94 


1-33 


0.994 


0.943 


97 


1-46 


0.964 


0.595 


103 


1-49 


0.983 


0.570


108 


1-26 


0.978 


0.885


111 


1-23 


0.989 


0.899 


126 


1-25 


0.955 


0.803 


129 


1-19 


0.963 


0.918


138 


1-29 


0.971 


0.844 


143 


1-18 


0.914 


0.628 


148 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811


158 


1-22 


0.979 


0.927 


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903


0.571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825


180 


1-27 


0.981 


0.941 


187 


1-28 


0.982 


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.975 


0.916 


197 


1-22 


0.963 


0.936 


199 


1-20 


0.935 


0.701


200 


1-23 


0.977


0.773


206 


1-30 


0.984 


0.890


207 


1-19 


0.990 


0.924 


208 


1-22 


0.974 


0.850 


210 


1-40 


0.940 


0.670


211 


1-28 


0.971 


0.849


216 


1-24 


0.986 


0.956 


218 


1-33 


0.961 


0.895


219 


1-19 


0.970 


0.871


221 


1-19 


0.904 


0.553


222 


1-21 


0.917


0.555


230 


1-19 


0.991


0.959


231 


1-26 


0.953


0.800


232 


1-25 


0.988 


0.826


239 


1-23 


0.969


0.828


240 


1-17 


0.982


0.955


241 


1-17 


0.982


0.955


245 


1-30 


0.970


0.722


248 


1-22 


0.976 


0.935


249 


1-23 


0.968


0.940


252 


1-18 


0.971


0.923


261 


1-24 


0.883


0.587


265 


1-18 


0.939


0.868


272 


1-24 


0.953


0.739


283 


1-21 


0.906


0.688


284 


1-29 


0.997


0.854


290 


1-31 


0.986 


0.841


302 


1-28 


0.980


0.893


304 


1-16 


0.907


0.635


312 


1-19 


0.993


0.976


313 


1-17 


0.930


0.753


323 


1-22 


0.998


0.909


324 


1-17 


0.582


0.954


328 


1-19 


0.971


0.865


329 


1-22 


0.963 


0.924


330 


1-33 


0.978


0.841 


331 


1-24 


0.920


0.712


332 


1-24 


0.975


0.881


333 


1-19 


0.984


0.941


334 


1-20 


0.899


0.567 


335 


1-27 


0.942


0.813 


336 


1-20 


0.952 


0.850


337 


1-38 


0.942


0.653


338 


1-27 


0.973


0.772


339 


1-36 


0.979 


0.804


340 


1-27 


0.888 


0.597


343 


1-19 


0.971 


0.865 


344 


1-22 


0.994


0.928


345 


1-17 


0.966 


0.687


346 


1-19 


0.936


0.822 


347 


1-22 


0.963


0.924


349 


1-24 


0.982 


0.966


351 


1-21 


0.918


0.815


352 


1-31 


0.988


0.912


354 


1-31 


0.974 


0.839 


355 


1-29 


0.932 


0.632


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726 


360


1-27 


0.938 


0.827 


361


1-25 


0.954 


0.674 


362 


1-22 


0.929 


0.788 


363 


1-21 


0.881


0.715


364 


1-33 


0.978


0.841


365 


1-33 


0.978 


0.841 


366 


1-21 


0.916 


0.820


367 


1-19 


0.936 


0.822


368 


1-29 


0.972


0.874 


370 


1-24 


0.920 


0.712


371 


1-24 


0.961


0.773


372 


1-27 


0.919


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994


0.932 


376 


1-34 


0.987


0.810 


377 


1-17 


0.995


0.950


378 


1-49 


0.971


0.749


380 


1-20 


0.968 


0.874


381 


1-20 


0.928


0.782 


382 


1-19 


0.986 


0.934


383 


1-28 


0.965


0.829


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881


388 


1-30 


0.989 


0.868


389 


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981 


0.900


393 


1-16 


0.968 


0.890


394 


1-23 


0.937 


0.701 


397


1-22 


0.985 


0.854


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899


0.567


402 


1-22 


0.967 


0.931


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973


405 


1-23 


0.994 


0.921 


407 


1-35


0.987


0.658


408 


1-39 


0.976 


0.551


409 


1-33 


0.897


0.570


410 


1-25 


0.990 


0.962


411 


1-38 


0.977


0.827


412 


1-20 


0.944


0.768


413 


1-20 


0.988 


0.965


414 


1-46 


0.993 


0.638


415 


1-23 


0.981


0.940


417 


1-29 


0.941 


0.672


418 


1-20 


0.952 


0.850 


419 


1-19 


0.986


0.967


420 


1-29 


0.965 


0.861


421 


1-22 


0.889


0.785


422 


1-48 


0.982


0.862 


424 


1-19 


0.979


0.933


428 


1-38 


0.942


0.653


430 


1-18 


0.947


0.595


432 


1-33 


0.957 


0.789


433 


1-26 


0.979


0.904


434 


1-27 


0.962


0.777


435 


1-24 


0.998


0.977


436 


1-27 


0.973 


0.772


443 


1-15 


0.966


0.940


448 


1-36 


0.979 


0.804


453 


1-41 


0.958 


0.609 


455 


1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23 


0.930 


0.593 






0.930 


0.593


512 


1-23 


0.930


0.593


515


1-18 


0.978 


0.956


523 


1-19 


0.936


0.822


3 £ 3 


1-22 


0.963


0.924


54 S 


1-24 


0.982 


0.966


550 


1-30 


0.933


0.713


552 


1-21 


0.973


0.912


CCA 


1-23 


0.969 


0.784 


571 


1-21 


0.918 


0 . 815 


574 


1-31 


0.988 


0 . 912 


5 3 0 i 


1-39 


0 . 925 


0.556 




i _ i 

i ji 


0.974 


0.839 


COR 
D Jo 


X - 4 ZJ 


0.932 


0.632 




1-29 


0 932 


0.632 


ci n 
biu 


X - d, 1 


0 99 0 


0.948 


O 1 
bZl 


VIC 

JL - -L D 




0.969 


623 




V . 3 J D 


0 726 


653 


1-2 7 


u . y j a 


n cm 
u - / 


668 


1-22 


u . y ^y 


v . / o o 


677 




n qa ft 


0 807 


635 


1-21 


0.881 


n tic 
u. /Id 


699 


1 *j o - 
1- i^S 


u . y /s 


"7T aid 
<J . o x o 


702 


1-31 


u . y o e 


n n q q 
u . o y o 


707 


1-16 


0 . 880 


n ceo 


713 


1-25 


U . job 


0 . 743 


716 


T 1 Q 


u . y J © 


0 822 




X - ^ U 


0 961 


0 824 




1 _ n q 

x- ^ y 


0 972 


0 . 874 


73 5 




0 903 


0 598 


/4b 


1-14 


0 916 


0 73 0 


7i7 


X - 


0 96 5 


0 876 


748 


i- .^y 


0 96 8 


0 785 




1 "OA 
x - ^4 


0 96 1 


0 773 


767 


1-27 


n qi q 
u . y ij 


u . / o 0 


768 


i 11 

X- -3-3 


0 900 


0 . 585 


771 


X - 




0 702 


/ / y 


x - xy 




0 . 945 


797 


1- 19 


0 . 944 


n 7C Q 


*7 on 

/JO 




0 900 


0 .568 




1-17 
x - X / 


0 995 


0 . 950 


ft7 7 


1-49 


0.971 


0 . 74 9 


o d q 
o ** o 


1-20 


0 . 968 


0 . 874 


oca 
O O *i 


1-20 


0 . 928 


0.782 


86 6 


1-19 


0 . 986 


0 . 934 


873 


1-23 


0 . 948 


0 .886 


8 81 


1-28 


0 . 965 


0 . 829 


887 


1-39 


0 . 970 


0 . 551 


927 


1-30 


0 . 989 


0.868 


934 


1-48 


0 . 988 


0 . 777 


93 9 


1-39 


0 . 994 


0.889 


944 


1-26 


0 . 971 


0.782 


950 


1-29 


0 . 957 


0.845 


963 


1-20 


0 . 981 


0 . 900 


964 


1-20 


0 . 886 


0 . 558 


973 


1-16 


0.968 


0. 890 


980 


1-34 


0.961 


0.749 


981 


1-20 


0.953 


0.822 


984 


1-12 


0.938 


0 . 780 


1015 


1-22 


0. 985 


0.854


1040 


1-46 


0.977 


0.698 


1052 


1-18 


0.969 


0.842


1059 


1-20 


0.927 


0.867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993 


0.935 
SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


1075 


1-27 


0 .992 


0 . 934 


1080 


1-19 


0 . 93 1 


0 . 829 


1092 


1-19 


0 .991 


0 . 973 


1094 


1-46 


0.992 


0 . 653 


1095 


1-30 


0 . 974 


0 . 929 


1105 


1-23 


0.994 


0 . 921 


1123 


1-35 


0 .987 


0 . 658 


1138 


1-32 


0 . 954 


0 . 613 


1140 


1-38


0.989 


0.789


1142 


1-33 


0.897 


0.570


1152 


1-25 


0.990 


ft Q£0 

\j . y t> z 


1170 


1-3 8 


0 . 977 


ft Q "> 7 

u . oz / 


1176 


i - o ft 


ft Q/ A 


0.768


1187 


i - z u 


U .300 


ft QCC 

u . ybD 


1189 




n qch 

U .7b / 


ft Q "> Q 




1- 't b 


u . y y j 


0.638 


1193 


1 - 1 o 


U . jZ3 


0.710 


1197 


1 _ 9 O 
1 - Zy 


0.985 


0 . 853 




1 _ 7 1 


ft 001 


0 . 940 






0 . 941 


0 . 672 


1245 


1 1 Q 
1-17 


0.986 


0.967 


J-<ojO 


1 0 Q 

JL - Z y 


0.965 


0.861 


1265 


1-ZZ 




0.785 


1266 


inn 
1- z u 


0 . 944 


0.809 


1276 


1 Aft 
1— «t o 


u . y 0 z 


0.862 


iz 7^: 




0 . 979 


0.933 


1296 


1-71 
l~il 


0 . 984 


0.944 


12 97 


1-19 


ft Q Q A 
U . y a ft 


ft QC1 


1332 


1-3 8 


0 94 2 


ft ezci 

U .D3J 


13 58 


1-18 


ft q a *7 
u . yfi / 


0 . 595 


1371 


1-33 


0 957 


ft TOO 

u . /ay 


1380 


1-26 


0.979 


ft QftA 
U . i»Ufl 


1397 


1-2 7 


ft aco 
u . ?bz 


0.777 


13 99 


1-23 


u . yy / 


0.960 


14 04 


1 - Z *± 


u . yyc 


0 . 977 


1410 


1-13 


ft QAC 

u . y%c 


0 . 845 


14 14 


1-24 


0 . 913 


ft CCfl 
V • JOO 


1415 


1-19 


ft QPO 


ft QTQ 

u . y zy 


1416 


1-12 


ft 1 


ft OQ1 

u . oy 1 


1418 


l — J u 


ft Q"3 "3 

v . yj J 


U . 3D J 


1420 


1 -zu 


ft Q Q 1 
U . Obi 


0 . 561 


1421 


1-19 


ft Qon 


ft QCfi 

u . yoo 


1423 


1 — 1 / 


n QC Q 1 


0 . 863 


1424 


1-21 


0.885 




1425 


1-24 


0.913 


0 . 588 


1426 


1-24 


0.913 


0 . 588 


1428 


1-25 


0.967 


ft ft QQ 

u . 0 y y 


1430 


1-34 


0.977 


0 . 819 


1431 


1-28 


0.979 


0 923 


1432 


1-36 


0.957 


0 613 


1433 


1-32 


0 .921 


0 753 


1434 


1-39 


0.983 


0 621 


1435 


1-25 


0 .910 


0.631 


1436 


1-42 


0.968 


0 868 


1437 


1-22


0.998


0.980 


1442 


1-20 


0.918 


0 . 753 


1448 


1-12 


0.931 


0 .891 


1462 


1-18 


0.968 


0 . 888 


1490 


1-20 


0.881 


0.561 


1518 


1-17 


0.968 


0 .863 


1525 


1-21 


0.885 


0.591 


1547 


1-28 


0 .974 


0.891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0.923 


0.824 


1593 


1-28 


0. 979 


0.923 
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SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


1596 


1-16 


0.929 


0.709 


1601 


1-36


0.957 


0.613 


1606 


1-22 


0. 979 


0.831 


1607 


1-20 


0.974 


0.770 


1608 


1-32 


0.921 


0.753 


1614 


1-33 


0.969 


0 .829 


1616 


1-20 


0.959 


0 .869 


1625 


1-39 


0. 983 


0 .621 


1632 


1-25 


0.910 


0.631 


1636 


1-33 


0 . 897 


0 .591 


1639 


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0 .568 


1647 


1-17 


0.923 


0.742 


1648 


1-22 


0.998 


0.980 



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TABLE 6 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1 


1787 


3573 


5359 


784CIP2_1 


1103 


2 


1788 


3574 


5360 


784CIP2 2 


2573


3 


1789 


3575


5361 


784CIP2_3


4117 


4 


1790 


3576 


5362 


784CIP2 4 


5556 


5 


1791 


3577 


5363 


784CIP2_5 


5562 


6 


1792 


3578 


5364


784CIP2_6 


5562 


7 


1793 


3579 


5365 


784CIP2_7


5562 


8 


1794 


3580 


5366 


784CIP2_8 


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563 


10 


1796 


3582 


5368 


784CIP2_10 


5564 


11 


1797 


3583 


5369 


784CIP2_11


5565 


12 


1798


3584 


5370 


784CIP2_12 


5689 


13 


1799 


3585 


5371 


784CIP2_13


5729 


14 


1800 


3586 


5372 


784CIP2_14


5745 


15 


1801 


3587 


5373 


784CIP2_15


5777 


16 


1802 


3588 


5374 


784CIP2_16


5777 


17 


1803 


3589 


5375 


784CIP2_17


5789 


18 


1804 


3590 


5376 


784CIP2_18


5792 


19 


1805 


3591 


5377 


784CIP2_19


5804


20 


1806 


3592 


5378 


784CIP2_20


5805 


21 


1807 


3593 


5379 


784CIP2_21 


5805 


22 


1808 


3594 


5380 


784CIP2 22 


5844 


23 


1809 


3595 


5381 


784CIP2_23 


5844 


24 


1810 


3596 


5382 


784CIP2_24 


5850 


25 


1811 


3597 


5383 


784CIP2_25


5867 


26 


1812 


3598 


5384 


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2_28 


5995 


29 


1815 


3601 


5387 


784CIP2_29 


6005


30 


1816 


3602 


5388 


784CIP2 30 


6007 


31 


1817


3603 


5389 


784CIP2_31 


6007 


32 


1818 


3604 


5390 


784CIP2 32 


6009 


33 


1819 


3605 


5391 


784CIP2_33


6012 


34 


1820 


3606 


5392 


784CIP2_34 


6015 


35 


1821 


3607 


5393 


784CIP2_35 


6016 


36 


1822 


3608 


5394 


784CIP2_36 


6016 


37 


1823 


3609 


5395 


784CIP2 37 


6018 


38 


1824 


3610 


5396 


784CIP2_38


6018 


39 


1825 


3611 


5397 


784CIP2_39


6018 


40 


1826 


3612 


5398 


784CIP2_40


6023 


41 


1827 


3613 


5399 


784CIP2 41 


6070 


42 


1828 


3614 


5400 


784CIP2 42 


6081 


43 


1829 


3615 


5401 


784CIP2_43 


6089 


44 


1830 


3616 


5402 


784CIP2_44 


6118 


45 


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832 


3618 


5404 


784CIP2 46 


6130 


47 


1833 


3619 


5405 


784CIP2_47 


6177 


48 


1834 


3620 


5406 


784CIP2_48


6189 


49 


1835 


3621 


5407 


784CIP2_49 


6191 


50 


1836 


3622 


5408 


784CIP2_50 


6204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2_54 


6436 


55 


1841 


3627 


5413 


784CIP2 55 


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2 57 


6457 


58 


1844 


3630 


5416 


784CIP2_58 


6458 


59 


1845 


3631 


5417 


784CIP2 59 


6458 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




60 


1846 


3632 


5418 


784CIP2_60 


6462 


61 


1847 


3633 


5419 


784CIP2 61 


6472 


62 


1848 


3634 


5420 


784CIP2 62 


6499 


63 


1849 


3635 


5421 


784CIP2 63 


6499 


64 


1850 


3636 


5422 


784CIP2 64 


6505 


65 


1851 


3637 


5423 


784CIP2 65 


6534


66 


1852 


3638 


5424 


784CIP2 66 


6534 


67 


1853 


3639 


5425 


784CIP2 67 


6540 


68 


1854 


3640 


5426 


784CIP2 68 


6550 


69 


1855 


3641 


5427 


784CIP2 69 


6550 


70 


1856 


3642 


5428 


784CIP2 70 


6592 


71 


1857 


3643 


5429 


784CIP2_71

t o** J. r A t j. 




72 


1858 


3644


5430


784CIP2_72




73 


1859 


3645 


5431


f oh ^irz / j 


6 7 63 


74 


1860 


3646 


5432 


784CIP2_74


b / DO 


75 


1861 


3647 


5433


7R4PTP7 7c; 

1 OH \~X V 4. 1 3 


c n 0 c 
b rob 


76 


1862


3648 


5434 


7fl4PTpp 7/- 
1 O t! * — L tr (D 


0 0 z% 


77 


1863 


3649 


5435


7flifTP? 77 


com 


78 


1864 


3650 


5436 


/ D4LirZ / 0 




79 


1865 


3651 


5437 


7R4CIP5 79 


C R 77 
bo 


80 


1866 


3652 


5438 


7R4CTP7 or\ 


Do Jft 


81 


1867 


3653 


5439


784CTP? fll 


6A74 


82 


1868 


3654 


5440 


/ O T^- X XT i. O Z 


DOj j 


83 


1869 


3655


5441 






84 


1870 


3656 


5442 


7R4PTP2 R4 




85 


1871 


3657 


5443 






86 


1872 


3658 


5444 


784CIP2 Bfi 


6915 


87 


1873 


3659 


5445 


7R4PT H7 


6932 


88


1874 


3660 


5446 


784CIP2 8R 


6957 


89 


1875 


3661 


5447 


784CTP9 RQ 


cqci 
D701 


90 


1876 


3662 


5448 






91 


1877 


3663 


5449 


7B4CIP2 91 


6973 


92 


1878 


3664 


5450 


784CiPi 95 


7007 


93 


1879 


3665 


5451 


784CIP2 94 


7 018 


94 


1880 


3666 


5452 


784CIP2 9^ 


7019 


95 


1881 


3667 


5453


784CTP2 96 


Tn^n 
/ uzu 


96 


1882 


3668 


5454 


784CIP2 97 


7020 


97 


1883 


3669 


5455 


784CIP2 98 


7021 


98 


1884 


3670 


5456 


7R4TTP7 99 


7023 


99 


1885 


3671 


5457 


784CIP2 100 


7027 


100 


1886 


3672


5458 


784CIP2 101 


7028 


101 


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888 


3674 


5460 


784CIP2 103 


7031 


103 


1889 


3675 


5461 


784CIP2 104 


7032 


104 


1890 


3676 


5462 


784CIP2 105 


7033 


105 


1891 


3677 


5463 


784CIP2 106 


7035 


106 


1892 


3678 


5464 


784CIP2 107 


7036 


107 


1893 


3679 


5465 


784CIP2 108 


7039 


108


1894 


3680 


5466 


784CIP2 109 


7043 


109 


1895 


3681 


5467 


784CIP2 110 


7044 


110 


1896 


3682 


5468 


784CIP2 111 


7046 


111 


1897 


3683 


5469 


784CIP2 112 


7054 


112 


1898 


3684 


5470 


784CIP2_113 


7061 


113 


1899 


3685


5471 


784CIP2_114 


7077 


114 


1900 


3686 


5472 


784CIP2_115 


7092 


115 


1901 


3687 


5473 


784CIP2 116 


7094 


116 


1902 


3688 


5474 


784CIP2_117 


7106 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 


5476 


784CIP2 119 


7111 


119 


1905 


3691 


5477 


784CIP2_120 


7123 


120 


1906 


3692 


5478 


784CIP2 121 


7142 


121 


1907 


3693 


5479 


784CIP2_122 


7142 






WO 01/53312 



PCT/US00/34263



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




122 


1908 


3694 


5480 


784CIP2 123 


7154 


123 


1909 


3695 


5481 


784CIP2 124 


7160 


124 


1910 


3696 


5482 


784CIP2 125 


7169 


125 


1911 


3697 


5483 


784CIP2_126


7185 


126 


1912 


3698 


5484 


784CIP2 127 


7197 


127 


1913 


3699 


5485 


784CIP2 128 


7219 


128 


1914 


3700


5486 


784CIP2_129


7226 


129 


1915 


3701 


5487 


784CIP2_130


7229 


130 


1916 


3702 


5488 


784CIP2_131 


7234 


131 


1917 


3703 


5489 


784CIP2 132 


7235 


132 


1918 


3704 


5490 


784CIP2 133 


7235 


133 


1919 


3705 


5491 


784CIP2_134


7238 


134 


1920 


3706 


5492 


784CIP2_135 


7247 


135 


1921 


3707 


5493 


784CIP2_136 


7261 


136 


1922 


3708 


5494 


784CIP2_137 


7262 


137 


1923 


3709 


5495


784CIP2 138 


7267 


138 


1924 


3710 


5496


784CIP2 139 


7272 


139 


1925 


3711 


5497 


784CIP2 140 


7273 


140 


1926 


3712 


5498


784CIP2_141 


7282 


141 


1927 


3713 


5499 


784CIP2_142 


7288 


142 


1928 


3714 


5500 


784CIP2_143 


7291 


143 


1929 


3715 


5501 


784CIP2 144 


7293 


144 


1930 


3716 


5502


784CIP2_145 


7294 


145 


1931 


3717 


5503 


784CIP2_146 


7299 


146 


1932 


3718 


5504 


784CIP2_147


7300


147


1933 


3719 


5505 


784CIP2 148 


7312 


148


1934 


3720 


5506 


784CIP2 149 


7313 


149


1935 


3721 


5507 


784CIP2 150 


7315 


150 


1936 


3722 


5508 


784CIP2_151 


7318 


151 


1937 


3723 


5509 


784CIP2 152 


7321 


152 


1938 


3724 


5510 


784CIP2_153 


7330 


153 


1939 


3725 


5511 


784CIP2 154 


7331 


154 


1940 


3726 


5512 


784CIP2_155 


7333 


155 


1941 


3727 


5513 


784CIP2_156 


7350 


156 


1942 


3728 


5514 


784CIP2_157 


7352 


157 


1943 


3729 


5515 


784CIP2_158 


7384 


158 


1944 


3730 


5516 


784CIP2_159 


7403 


159 


1945 


3731 


5517 


784CIP2_160 


7431 


160 


1946 


3732 


5518 


784CIP2_161


7441 


161 


1947 


3733 


5519 


784CIP2_162


7453


162 


1948 


3734 


5520 


784CIP2_163


7467 


163 


1949 


3735 


5521 


784CIP2_164 


7471 


164 


1950 


3736


5522 


784CIP2_165 


7493 


165 


1951 


3737 


5523 


784CIP2_166 


7502 


166 


1952 


3738 


5524 


784CIP2_167 


7511 


167 


1953 


3739 


5525 


784CIP2_168


7514 


168 


1954 


3740 


5526 


784CIP2_169


7520 


169 


1955 


3741 


5527 


784CIP2_170 


7541 


170 


1956 


3742 


5528 


784CIP2_171 


7570 


171 


1957 


3743 


5529 


784CIP2_172 


7578 


172 


1958 


3744 


5530 


784CIP2 173 


7583 


173 


1959 


3745 


5531 


784CIP2 174 


7592 


174 


1960 


3746 


5532 


784CIP2 175 


7601 


175 


1961


3747 


5533 


784CIP2 176 


7602 


176 


1962 


3748 


5534 


784CIP2 177 


7608


177 


1963 


3749 


5535 


784CIP2_178 


7615 


178 


1964 


3750 


5536 


784CIP2_179 


7617 


179 


1965 


3751 


5537 


784CIP2 181 


7624 


180 


1966 


3752 


5538 


784CIP2 182 


762£ 


181


1967 


3753 


5539 


784CIP2 183 


7640 


182 


1968 


3754 


5540 


784CIP2 184 


7641 


183 


1969 


3755 


5541 


784CIP2 185 


7641 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


184


1970 


3756 


5542 


784CIP2_186 


7641 


185 


1971 


3757 


5543 


784CIP2 187 


7642 


186 


1972 


3758 


5544 


784CIP2 188 


7649 


187 


1973 


3759 


5545 


784CIP2_189


7656 


188 


1974 


3760 


5546 


784CIP2 190 


7657 


189 


1975 


3761 


5547 


784CIP2 191 


7657 


190 


1976 


3762 


5548 


784CIP2 192 


7662 


191 


1977 


3763 


5549 


784CIP2 193 


7668 


192 


1978 


3764 


5550 


784CIP2 194 


7673 


193 


1979 


3765 


5551 


784CIP2_195 


7690 


194 


1980 


3766 


5552 


784CIP2 196 


7700 


195 


1981 


3767 


5553 


784CIP2 197 


7709 


196 


1982 


3768 


5554 


784CIP2_198 


7736 


197 


1983 


3769 


5555 


784CIP2 199 


7737 


198 


1984 


3770 


5556 


784CIP2_200 


7744 


199 


1985 


3771 


5557 


784CIP2_201 


7771 


200 


1986 


3772 


5558 


784CIP2 202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988


3774 


5560 


784CIP2 204 


7797 


203 


1989 


3775 


5561 


784CIP2_205


7806 


204 


1990 


3776 


5562 


784CIP2_206 


7812 


205 


1991 


3777 


5563 


784CIP2_207 


7812


206 


1992 


3778 


5564 


784CIP2_208


7818


207 


1993 


3779 


5565 


784CIP2_209


7822 


208 


1994 


3780 


5566 


784CIP2_210 


7827 


209 


1995 


3781 


5567 


784CIP2_211


7830 


210 


1996 


3782 


5568 


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840 


212 


1998 


3784 


5570 


784CIP2 215 


7858 


213 


1999 


3785 


5571 


784CIP2 216 


7858 


214 


2000 


3786 


5572 


784CIP2 217 


7861 


215 


2001 


3787 


5573 


784CIP2_218 


7866 


216 


2002 


3788 


5574 


784CIP2 219 


7868 


217 


2003 


3789 


5575 


784CIP2_220 


7896 


218 


2004 


3790 


5576 


784CIP2_221 


7898 


219 


2005 


3791 


5577 


784CIP2_222 


7900 


220 


2006 


3792 


5578 


784CIP2_223 


7906 


221 


2007 


3793 


5579 


784CIP2_224


7908 


222 


2008 


3794


5580 


784CIP2 225 


7909 


223 


2009 


3795 


5581 


784CIP2_226 


7917 


224 


2010 


3796 


5582 


784CIP2_227 


7932 


225 


2011


3797 


5583 


784CIP2 228 


7940 


226 


2012 


3798 


5584 


784CIP2_229 


7940 


227 


2013 


3799 


5585 


784CIP2_230 


7984 


228 


2014 


3800 


5586 


784CIP2 231 


7984 


229


2015 


3801


5587 


784CIP2_232 


8001


230


2016 


3802 


5588 


784CIP2 233 


8021 


231 


2017 


3803 


5589 


784CIP2_234 


8029 


232 


2018 


3804 


5590 


784CIP2_235 


8033 


233


2019 


3805 


5591 


784CIP2_236 


8040 


234


2020 


3806 


5592 


784CIP2 237 


8052 


235 


2021 


3807 


5593 


784CIP2 238 


8096 


236 


2022 


3808


5594 


784CIP2 239 


8096 


237 


2023 


3809 


5595 


784CIP2 240 


8113


238 


2024 


3810 


5596 


784CIP2 241 


8126


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243


8137 


241 


2027 


3813 


5599 


784CIP2_244


8137 


242 


2028 


3814 


5600 


784CIP2 245 


8159 


243 


2029 


3815 


5601 


784CIP2_246 


8159 


244 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2_248 


8176
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




246 


2032 


3818 


5604 


784CIP2_249


8196 


247 


2033 


3819 


5605 


784CIP2_250 


8200 


248 


2034 


3820 


5606 


784CIP2_251 


8212 


249 


2035 


3821 


5607 


784CIP2_252 


8220 


250


2036 


3822 


5608 


784CIP2 253 


8238 


251 


2037 


3823 


5609 


784CIP2 254 


8254 


252 


2038 


3824 


5610 


784CIP2_255


8255 


253 


2039 


3825 


5611 


784CIP2 256 


8288 


254 


2040 


3826 


5612 


784CIP2 257 


8296 


255 


2041 


3827 


5613 


784CIP2_258


8329 


256 


2042 


3828 


5614 


784CIP2 259 


8362 


257 


2043 


3829 


5615 


784CIP2_260


8429 


258 


2044 


3830 


5616 


784CIP2 261 


8436 


259 


2045 


3831 


5617 


784CIP2_262 


8448 


260 


2046 


3832 


5618 


784CIP2_263 


8472 


261 


2047 


3833 


5619 


784CIP2 264 


8502 


262 


2048 


3834 


5620 


784CIP2_265 


8504 


263 


2049 


3835 


5621 


784CIP2_266 


8507 


264 


2050 


3836 


5622 


784CIP2 268 


8509 


265


2051 


3837 


5623 


784CIP2 269 


8515 


266 


2052 


3838 


5624 


784CIP2 270 


8519 


267 


2053 


3839 


5625 


784CIP2_271


8530 


268 


2054 


3840 


5626 


784CIP2_272


8532 


269 


2055 


3841 


5627 


784CIP2_273 


8532 


270 


2056 


3842 


5628 


784CIP2_274


8539 


271 


2057 


3843 


5629 


784CIP2_275 


8541 


272 


2058 


3844 


5630 


784CIP2_276 


8543 


273 


2059 


3845 


5631 


784CIP2_277 


8593 


274 


2060 


3846 


5632 


784CIP2_278


8595 


275 


2061 


3847 


5633 


784CIP2_279 


8615 


276 


2062 


3848 


5634 


784CIP2_280 


8620 


277 


2063 


3849 


5635 


784CIP2_281 


8621


278 


2064 


3850 




784CIP2_282 


8623 


279 


2065 


3851 


5637 


784CIP2_283


8625 


280 


2066 


3852 


5638 


784CIP2 284 


8628 


281 


2067 


3853 


5639 


784CIP2_285 


8628 


282 


2068 


3854 


5640 


784CIP2_286


8629 


283 


2069 


3855 


5641 


784CIP2_287


8630 


284 


2070 


3856 


5642 


784CIP2_288 


8631 


285 


2071 


3857 


5643 


784CIP2 289 


8633 


286 


2072 


3858 


5644 


784CIP2_290 


8634 


287 


2073 


3859 


5645 


784CIP2 291 


8635 


288 


2074 


3860 


5646 


784CIP2 292 


8636 


289


2075 


3861 


5647 


784CIP2_293 


8659 


290 


2076 


3862 


5648 


784CIP2_294 


8660 


291 


2077 


3863 


5649 


784CIP2 295 


8667


292 


2078 


3864 


5650 


784CIP2_296 


8667 


293


2079 


3865 


5651 


784CIP2_297 


8685 


294 


2080 


3866 


5652 


784CIP2_298 


8805 


295 


2081 


3867 


5653 


784CIP2_299 


8896 


296 


2082


3868 


5654 


784CIP2_300 


8978 


297 


2083 


3869 


5655 


784CIP2 301 


9046 


298 


2084 


3870 


5656 


784CIP2_302 


9048 


299 


2085


3871 


5657 


784CIP2_303 


9116 


300 


2086


3872 


5658 


784CIP2_304 


9195 


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302 


2088 


3874 


5660 


784CIP2_306


9307 


303 


2089 


3875 


5661 


784CIP2 307 


9321 


304


2090 


3876 


5662 


784CIP2_308


9397 


305 


2091 


3877 


5663 


784CIP2_309 


9405 


306 


2092 


3878 


5664 


784CIP2 310 


9406 


307 


2093 


3879 


5665


784CIP2 311 


9422 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




308


2094


3880


5666


784CIP2_312


9494


309


2095


3881


5667


784CIP2_313


9512


310


2096


3882


5668




3bo2 


311


2097


3883


5669


784CIP2_315


9661


312


2098


3884


5670


784CIP2_316


9664


313


2099


3885


5671


784CIP2_317


9691


314


2100


3886


5672


784CIP2_318


9700


315


2101


3887


5673


784CIP2_319


9716


316 


2102 


3888 


5674 


784CIP2_320


9721 


317 


2103 


3889 


5675 


784CIP2 321 


9870


318 


2104 


3890 


5676 


784CIP2_322


9887


319 


2105 


3891 


5677 


784CIP2_323 


9923 


320 


2106 


3892 


5678 


784CIP2_324 


9938 


321 


2107 


3893 


5679 


784CIP2_325 


9964 


322 


2108 


3894


5680


784CIP2_326


10007 


323 


2109 


3895 


5681 


784CIP2_327 


10009 


324 


2110 


3896 


5682 


784CIP2_328 


10046 


325 


2111 


3897


5683 


784CIP2_329


10156 


326 


2112 


3898 


5684 


784CIP2_330 


10276 


327


2113 


3899 


5685 


784CIP2_331 


10283 


328


2114 


3900 


5686 


784CIP2B_1 


152 


329 


2115 


3901 


5687 


784CIP2B_2 


167 


330 


2116 


3902 


5688 


784CIP2B_3 


205 


331 


2117 


3 903 


5689 


784CIP2B 4 


210 


3 32 


2118 


3 904 


5690 


784CIP2B_5 


225 


333 


2119 


3905 


5691 


784CIP2B_6 


226 


334 


2120 


3906 


5692 


784CIP2B_7


264 


335 


2121 


3 907 


5693 


784CIP2B 8 


268 


336 


2 122 


3 908 


5694 


784CIP2B 9 


293 


3 37 


2123 


3 909 


5695


784CIP2B_10 


293 


338 


2124 


3910 


5696 


784CIP2B_11 


293


339 


2125 


3 911 


5697 


784CIP2B__12 


3 02 


nT>i r\ 

340 


2126 


3 912 


5698 


784CIP2B_13 


311 


341 


2127 


3 913 


5699 


784CIP2B_14 


3 52 


342 


2128 


3914 


5700 


784CIP2B 15 


358 


343 


2129 


3915 


5701 


784CIP2B_16


368 


344 


2130 


3916 


5702 


784CIP2B_17 


3 93 


345 


2131 


3917 


5703 


784CIP2B_18 


477 


1 A C 

i4o 


2132 


3918 


5704 


784CIP2B_19 


508 


347 


2133 


3 919 


5705 


784CIP2B_20 


508 


348 


2134 


3 920 


5706 


784CIP2B_21 


515 


349 


213 5 


3 921 


b / u / 


/B4L.1P2B 22 


578 


-JbU 


2136 


3 922 


5708 


784CIP2B_23


588 


JO J. 


2137 


3 923 


5709 


784CIP2B_24 


591 




213 8 


3 924 


5710 


/o4t_lP2o_2b 


593 


J _> J> 


2139 


■JOOC 


b /XX 


/ o4CXP^i3__2 6 


594 


3 54 


*L 1 *i U 


~>QOC "* 
5 y Z b 


b /xz 


/ B4v LP^b ^ / 


619 


let 
J DO 


01/11 


1QT7 


b / Xj 


IPAfTDTD no 

/o4v_1PZd__2o 




3 56 




"2 Q "7 Q 


C 7 1 A 
b / 14 




b 54 


J J / 


2143 


i a o q 


5715 


/84LIP2B 30 


692 


ICQ 
O DO 


O 1 A A 


lain 
J y 3 U 


57 16 


784CIP2B_31 


753 


359 


2145 


3931 


5717 




758 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933 


5719 


784CIP2B_34 


833 


362 


2148 


3934 


5720 


7B4CIP2B_35 


838 


363 


2149 


3935 


5721 


7B4CIP2B_36 


870 


364 


2150 


3936 


5722 


7B4CIP2B_37 


891 


365 


2151 


3937 


5723 


784CIP2B_38 


891 


3 66 


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B 40 


924 


368 


2154 


3940 


5726 


784CIP2B_41 


932 


369 


2155 


3941 


5727 


784CIP2B_42 


942 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




O 1 V 




3942 


5728 


784CIP2B 43 


958 


"< 7 1 




3 943 


572 9 


784CIP2B 44 


968 


■37O 
J l£ 


■11 CO 

61j0 i 


3 944 


573 0 


784CIP2B 45 


992 




" 7 1 CO. " 
*15j 


3 945 


5731 


784CIP2B 46 


1025 


T Til 
J /4 


7 1 c n 

<i lb U | 


3946 


5732 


784CIP2B 47 


1074 




6lOl ; 


3947 


5733 


784CIP2B 48 


1104 


J /o 


7 1 c? 

61B6 


3 94 8 


5734 


784CIP2B 49 


1114 


3 77 




3949 


5735


784CIP2B 50 


1144 


378 


7 1 £A 
<1q4 


3 950 


5736 


784CIP2B 51 


1262 


3 79 


2 165 




573 7 


7R4PTP9R 5? 


1318 


3 80 


2 166 


7 Q G O 

JJ36 


573 8 




1319 


3 81 


2167 


J 33J 


CT3Q 




132 8 


3 82 


2168 


3954 


CO a n 




1436 


383 


2169 


3955 


5741 




11D 


384 


2170 


3956 




7P/CT DTD t; 7 
/ 0 *» V-X Jtr^£3__3 / 


1584 


385 


2171 


3957 


C74i 


7fliTTP5R C ft 


1617 


386 


2172 


3958 




■Jfl/PTD313 c q 
/o*Llr£D O if 


1724 


387 


2173 


3959 


5745 


DOTS C(l 


172 8 


388 


2174 


3 960 


574 6 


TflAPT DID Cl 
/ O 4 ^ 1 ¥ZE> O X 


1 OOO 


389 


2175 


3961 


574 7 


/o4Ll tr2a b J. 


ic.no. 

lO U J 


390 


2176 


3962 


574 8 


70iir" , TD7T3 C7 


1868 


391 


2177 


3 963 


C 7 A Q 


1 O t k\-.Lkr 4.D o't 


189 8 


392 


2178 


3 964 


i> /ou 


/04k-lJr£l3 Dj 




393 


2179 


3965 


3/31 




1965 


394 


2180 


3 966 


5752 


OQAf^T DID £ 7 
/ 04 l»l f4b © / 


1967 


395 


2181 


3967 


5753 


/□4L.1JT6I3 DO 


1995 


396 


2182 


3 968 


5754 


/ 0*t 1. X r ^ £3 


2005 


397 


2183 


3 969 


D /33 


TfldrTPIB 7fi 


2027 


398 


2184 


3 970 


5756 


/O*al_Xlr£0 /J. 


2053


3 99 


2185 


J J fx 


5757 


7fl4f , TP7R 79 


2103 


400 


2186 




5758 


7P.4PTP7R 73 


2106 


401 


2187 


1 Q7 1 


5759 


7A4PTP7R 74 


2166 


402 


2188 


■j an/ 
/4 


5760 


7R4PTP2B 7^ 


2175 


403 


"-> too 




5 761 


7S4CIP2B 76 


2176 


404 


2190 


J J /o 


5762 


784CIP2B 78 


2236 


4 05 


2191 


"3 QT7 
J J / / 




7fldrTP?R 79 


2250 


406 


2192 


3 978 


5764 


7P.4CTP7R 80 


2300 


407 


2193 


jy /y 


c;76e: 


7fiirTP5R fll 


2323 


408 


2 1 y 4 


1 6 a n 


5766 


784CTP2B 82 


2340 


409 


2195 




5767 


7ft4r*TPS>R fll 


2371 


410 


2196 


j y o a 


5768 


784rTP?R B4 


2399 


411 


2197 


J JO J 


5769 


784f , TP2R B5 


2411 


412 


2198 


J Jul 


5770 


7ft4rTP?R 86 


2428 


413 


n on 

2 iy y 


o y o 3 


5771 


7fl4r , TP9R R7 


2430 


ATA 
4 14 


77 nn 
z «i u u 




5772 


784CIP2B 88 


2439 


Alt 

41o 


Z 2\J X 


39 87 


5773 


784CIP2B 89 


2447 


1 AO 


2202 


3988 


5774 


784CIP2B 90 


2461 


4 17 


oo m 


3 9 89 


5775 


7S4CIP2B 91 


2487


418 




"3 q on 
j y y u 


5776 


7R4riP2B 92 


2492 


iia 
41? 






5777 


784CIP2B 93 


2512 


420 


2 2 U b 


TOO? 

j y y 4 


"' c;77fl 

3 / / 0 


7ft4PTP7R Q4 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B_96 


2816 


423 


2209 


3995 


5781 


784CIP2B_97 


2818 


424 


2210 


3996 


5782 


784CIP2B_98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998


5784 


784CIP2B_100 


3137 


427 


2213 


3999 


5785 


784CIP2B_101 


3137


428 


2214 


4000 


5786 


784CIP2B_102 


3160 


429 


2215 


4001 


5787 


784CIP2B_103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362


SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725


432


2218


4004 


5790 


784CIP2B 106 


3417 


433


2219 


4005 


5791


784CIP2B 107 


3418 


434


2220


4006


5792


784CIP2B_108


3442 


435


2221


4007 


5793 


784CIP2B 109 


3442 


436


2222 


4008


5794 


784CIP2B 110 


3444 


437


2223


4009


5795 


784CIP2B_111


3855


438


2224


4010


5796


784CIP2B_112


3863


439 


2225 


4011 


5797 


784CIP2B_113


4090


440 


2226


4012


5798


784CIP2B_114


4105 


441


2227


4013


5799


784CIP2B_115


4142


442 


2228 


4014


5800


784CIP2B_116


A 1 A *5 
\ X*iZ 


443 


2229


4015 


5801


784CIP2B_117


A 1 A Q 
** X» ^ 


444 


2230


4016 


5802 


784CIP2B_118


1 X i7b 


445 


2231 


4017 


5803


784CIP2B_119


4202


446 


2232 


4018


5804


784CIP2B_120


A D T d 


447 


2233 


4019 


5805


784CIP2B_121


4304 


448


2234


4020


5806


784CIP2B_122


a i n c 


449 


2235


4021


5807


784CIP2B_123


A 11 1 
1 -i XX 


450


2236 


4022


5808


784CIP2B_124


4321


451 


2237 


4023


5809 


784CIP2B_125


A T *5 "3 


452 


2238


4024


5810


784CIP2B_126


A 1 1 5 
4 J J Z 


453 


2239


4025 


5811


784CIP2B_127


/J OR 


454 


2240


4026


5812


784CIP2B_128


a R fl n 

H D O O 


455 


2241 


4027 


5813 


784CIP2B_129


eccQ 


456 


2242 


4028


5814


784CIP2B_130


33 (J 


457


2243 


4029 


5815


784CIP2B_131


5577 


458 


2244 


4030


5816


784CIP2B_132


DD / i? 


459 


2245 


4031


5817 


784CIP2B_133


c; c: q -j 


460 


2246 


4032


5818 


784CIP2B_134




461 


2247 


4033 


5819


784CIP2B_135


5d~8~4" 


462 


2248 


4034 


5820


784CIP2B_136


D D O D 


463


2249


4035


5821 


784CIP2B_137


5591 


464 


2250 


4036


5822


784CIP2B_138


5593 


465 


2251 


4037


5823


784CIP2B_139


C C QA 
D D ?4 


466


2252


4038


5824


784CIP2B_140


c c q a 

D D 3* 


467


2253


4039


5825 


784CIP2B_141


5598 


468 


2254 


4040


5826


784CIP2B_142


DO UZ 


469 


2255


4041


5827


784CIP2B_143


5^05 


470


2256


4042


5828


784CIP2B_144


5608 


471


2257


4043


5829


784CIP2B_145


5617


472


2258


4044


5830


784CIP2B_146


5620 


473


2259


4045


5831


784CIP2B_147


5622


474


2260


4046


5832


784CIP2B_148


5623


475


2261


4047


5833


784CIP2B_149


5624 


476


2262 


4048


5834 


784CIP2B 150 


5625 


477 


2263


4049


5835 


784CIP2B 151 


5627 


478


2264 


4050 


5836 


784CIP2B 152 


5628 


479 


2265 


4051 


5837 


784CIP2B 153 


5630 


480 


2266 


4052 


5838


784CIP2B 154 


5632 


481


2267


4053


5839


784CIP2B_155


5640 


482


2268


4054


5840 


784CIP2B_156


5641 


483 


2269 


4055


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158 


5647 


485 


2271 


4057 


5843 


784CIP2B_159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160 


5658 


487 


2273 


4059 


5845 


784CIP2B_161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667 


489


2275 


4061 


5847 


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164 


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678 


492 


2278 


4064 


5850 


784CIP2B_166 


5680 


493 


2279 


4065 


5851


784CIP2B_167


5684


494 


2280 


4066 


5852 


784CIP2B 168 


5686 


495 


2281 


4067 


5853 


784CIP2B 169 


5694 


496 


2282 


4068 


5854 


784CIP2B_170 


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498 


2284 


4070 


5856 


784CIP2B 172 


5712 


499 


2285 


4071 


5857 


784CIP2B 173 


5719 


500 


2286 


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B 175 


5727 


502 


2288 


4074 


5860 


784CIP2B 176 


5730 


503 


2289 


4075 


5861 


784CIP2B 177 


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738 


505 


2291 


4077 


5863


784CIP2B_179


5739 


506 


2292 


4078 


5864 


784CIP2B_180


5740 


507 


2293 


4079 


5865


784CIP2B 181 


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509 


2295 


4081 


5867 


784CIP2B 183 


5749 


510 


2296 


4082 


5868 


784CIP2B 184 


5750 


511 


2297 


4083 


5869 


784CIP2B 185 


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B 187 


5761 


514 


2300 


4086 


5872 


784CIP2B 188 


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302 


4088 


5874 


784CIP2B 190 


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B 192 


5784 


519 


2305 


4091 


5877 


784CIP2B 193 


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B_196


5807 


522 


2308 


4094 


5880 


784CIP2B 197 


5818 


523 


2309 


4095 


5881 


784CIP2B 198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199 


5827 


525 


2311 


4097 


5883 


784CIP2B_200 


5828 


526 


2312 


4098 


5884 


784CIP2B 201 


5842 


527 


2313 


4099 


5885 


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B_203


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888 


784CIP2B 205 


5865 


531 


2317 


4103 


5889 


784CIP2B 206 


5871 


532 


2318 


4104 


5890 


784CIP2B_207


5873 


533 


2319 


4105 


5891 


784CIP2B_208


5873 


534 


2320 


4106 


5892 


784CIP2B_209 


5875 


535 


2321 


4107 


5893 


784CIP2B 210 


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537 


2323 


4109 


5895 


784CIP2B_212


5880 


538 


2324 


4110 


5896 


784CIP2B 213 


5880 


539 


2325 


4111 


5897 


784CIP2B_214 


5880 


540 


2326 


4112 


5898 


784CIP2B 215 


5880 


541 


2327 


4113 


5899 


784CIP2B_216 


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901 


784CIP2B 218 


5898 


544 



2330 


4116 


5902 


784CIP2B_219 


5902 


545 


2331 


4117 


5903 


784CIP2B 220 


5904 


546 


2332 


4118 


5904 


784CIP2B 221 


5918 


547 


2333 


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225


5939 


551 


2337 


4123 


5909 


784CIP2B 226 


5945


552 


2338 


4124 


5910 


784CIP2B 227 


5946 


553


2339 


4125 


5911 


784CIP2B 228 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967


556 


2342 


4128 


5914 


784CIP2B_232


5975 


557 


2343 


4129 


5915 


784CIP2B_233


coin 


558 


2344 


4130 


5916 


784CIP2B_234 


5978 


559 


2345 


4131 


5917 


784CIP2B_235


5979 


560 


2346 


4132 


5918 


784CIP2B_236


5980 


561 


2347 


4133 


5919 


784CIP2B_237


5988


562 


2348


4134 


5920 


784CIP2B_238 


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B_240


5997 


565 


2351 


4137 


5923 


784CIP2B 241 


5998 


566 


2352 


4138 


5924 


784CIP2B_242 


6003 


567 


2353 


4139 


5925 


784CIP2B_243 


6004 


568 


2354 


4140 


5926 


784CIP2B_244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245 


6028 


570 


2356 


4142 


5928 


784CIP2B_246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B_249


6031 


574 


2360 


4146 


5932 


784CIP2B_250 


6032 


575 


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B_253 


6043 


578


2364 


4150 


5936 


784CIP2B 254 


6044 


579 


2365 


4151 


5937 


784CIP2B_255 


6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048 


581 


2367 


4153 


5939 


784CIP2B 257 


6049 


582 


2368 


4154 


5940 


784CIP2B_258 


6051 


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584 


2370 


4156 


5942 


784CIP2B_260 


6060 


585 


2371 


4157 


5943 


784CIP2B_261 


6063 


586 


2372 


4158


5944 


784CIP2B_262


6066 


587 


2373 


4159 


5945 


784CIP2B_263 


6067 


588 


2374 


4160 


5946 


784CIP2B_264


6068 


589


2375 


4161 


5947 


784CIP2B_265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B_268


6077


593 


2379 


4165 


5951 


784CIP2B_269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270 


6082 



595 


2381 


4167 


5953 


784CIP2B 272 


6088 


596 


2382 


4168 


5954 


784CIP2B_273 


6091 


597 


2383 


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275 


6101 


599 


2385 


4171 


5957 


784CIP2B_276 


6103 


600


2386 


4172 


5958 


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602 


2388 


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B_280 


6121 


604


2390 


4176 


5962 


784CIP2B_281 


6125 


605 


2391 


4177 


5963 


784CIP2B_282 


DlZD 


606 


2392 


4178 


5964 


784CIP2B 283 


6128 


607 


2393


4179


5965


784CIP2B_284


6129 


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967 


784CIP2B_286 


6133 


610 


2396 


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B_288 


6139 


612 


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B_290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973 


784CIP2B 292 


6148 


616 


2402 


4188 


5974 


784CIP2B_293 


6149 


617 


2403 


4189 


5975 


784CIP2B 294 bl'kif


618 


2404 


4190 


5976 


784CIP2B 295 


6153 


619 


2405


4191 


5977 


784CIP2B_296 


6159 


620


2406 


4192 


5978 


784CIP2B 297 


6164 


621 


2407 


4193


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B_300 


6173 


624 


2410 


4196 


5982 


784CIP2B_301 


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B 303 


6196 


627 


2413 


4199 


5985 


784CIP2B 304 


6197 


628 


2414 


4200 


5986


784CIP2B 305 


6198 


629 


2415 


4201 


5987 


784CIP2B_306 


6198 


630 


2416 


4202 


5988 


784CIP2B 308 


6214 


631 


2417 


4203 


5989 


784CIP2B 309 


6215 


632 


2418 


4204 


5990 


784CIP2B 310 


6219 


633 


2419 


4205 


5991 


784CIP2B_311 


6226 


634 


2420 


4206 


5992 


784CIP2B_312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B 314 


6237 


637 


2423 


4209 


5995 


784CIP2B_315


6238 


638 


2424 


4210 


5996 


784CIP2B_316


6239 


639 


2425 


4211 


5997 


784CIP2B 317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B 319 


6240 


642 


2428 


4214 


6000 


784CIP2B 320 


6244 


643 


2429 


4215 


6001 


784CIP2B 321 


6245 


644 


2430 


4216 


6002 


784CIP2B 322 


6250 


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432 


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B 325 


6256 


648 


2434 


4220 


6006 


784CIP2B 326 


6260 


649 


2435 


4221 


6007 


784CIP2B 327 


6261 


650 


2436 


4222 


6008


784CIP2B_328


6264


651 


2437 


4223 


6009 


784CIP2B 329 


6265 


652 


2438 


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440 


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B 335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336 


6281 


658 


2444 


4230 


6016 


784CIP2B_337 


6281 


659 


2445 


4231 


6017 


784CIP2B 338 


6288 


660 


2446 


4232 


6018 


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B 340 


6294 


662 


2448 


4234 


6020 


784CIP2B 343 


6312 


663 


2449 


4235 


6021 


784CIP2B 344 


6312


664 


2450 


4236 


6022 


784CIP2B 345 


6312


665 


2451 


4237 


6023 


784CIP2B 346 


6322 


666 


2452 


4238 


6024 


784CIP2B 347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668 


2454 


4240 


6026 


784CIP2B_350 


6331 


669 


2455 


4241 


6027 


784CIP2B 351 


6333 


670 


2456 


4242 


6028 


784CIP2B 352 


6334 


671 


2457 


4243 


6029 


784CIP2B 353 


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B 355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348


675 


2461 


4247 


6033


784CIP2B_357 


6348 


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465 


4251 


6037 


784CIP2B 361 


6362


680


2466 


4252 


6038 


784CIP2B_362


6368 


681


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468


4254 


6040 


784CIP2B 364 


6371 


683 


2469


4255


6041 


784CIP2B 365 


6376 


684


2470 


4256 


6042 


784CIP2B_366


6379 


685 


2471 


4257 


6043 


784CIP2B_367


6380 


686


2472 


4258 


6044 


784CIP2B_368


6381 


687


2473 


4259 


6045 


784CIP2B 369 


6392 


688 


2474 


4260 


6046 


784CIP2B 370 


6395 


689


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B 373 


6401 


692 


2478 


4264 


6050 


784CIP2B_374 


6411 


693 


2479 


4265 


6051 


784CIP2B_375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376 


6411 


695 


2481 


4267 


6053 


784CIP2B_377 


6416 


696 


2482 


4268 


6054 


784CIP2B_378 


6418 


697 


2483 


4269 


6055 


784CIP2B_379 


6422 


698 


2484 


4270 


6056


784CIP2B 380 


6423 


699 


2485 


4271 


6057 


784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B 382 


6427 


701 


2487 


4273 


6059 


784CIP2B 383 


6428


702 


2488 


4274


6060 


784CIP2B 384 


6429 


703 


2489 


4275 


6061 


784CIP2B 385 


6430 


704 


2490 


4276 


6062 


784CIP2B 386 


6432 


705 


2491 


4277 


6063 


784CIP2B 387 


6432 


706 


2492 


4278 


6064 


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B 389 


6441 


708 


2494 


4280 


6066 


784CIP2B 390 


6446 


709 


2495


4281 


6067 


784CIP2B_391


6454 


710


2496 


4282 


6068 


784CIP2B_392 


6459 


711


2497


4283 


6069 


784CIP2B_394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395 


64 6> 


713 


2499 


4285 


6071 


784CIP2B 396 


6468 


714 


2500 


4286 


6072 


784CIP2B_397 


6487 


715 


2501 


4287 


6073 


784CIP2B_398 


6491 


716 


2502 


4288 


6074 


784CIP2B_399 


6506 


717 


2503 


4289 


6075 


784CIP2B 401 


6514 


718 


2504 


4290 


6076 


784CIP2B_402


6519 


719 


2505 


4291 


6077 


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721 


2507 


4293 


6079 


784CIP2B_405 


6536 


722 


2508


4294 


6080 


784CIP2B_406 


6543 


723


2509 


4295 


6081 


784CIP2B_407 


6544


724


2510


4296 


6082 


784CIP2B 408 


6548


725 


2511 


4297 


6083 


784CIP2B_409 


6551 


726 


2512


4298 


6084 


784CIP2B 410 


6551 


727 


2513


4299 


6085 


784CIP2B_411 


6552 


728 


2514


4300 


6086 


784CIP2B 412 


6554


729


2515


4301 


6087 


784CIP2B_413 


6556 


730 


2516


4302 


6088 


784CIP2B 414 


6560 


731


2517


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090


784CIP2B_416


6564 


733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306 


6092 


784CIP2B_418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593


738 


2524 


4310 


6096


784CIP2B_422


6595 


739


2525 


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


662S 


741 


2527


4313 


6099 


784CIP2B 425 


662S


742 


2528 


4314 


6100 


784CIP2B_426 


6626 


743 


2529 


4315 


6101 


784CIP2B_427 


6630 


744 


2530


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429


6632


746 


2532 


4318 


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B_431 


6634 


748 


2534 


4320 


6106 


784CIP2B_432 


6638 


749 


2535 


4321 


6107 


784CIP2B_433 


6641 


750 


2536 


4322 


6108 


784CIP2B_434 


6644 


751 


2537 


4323 


6109 


784CIP2B_435


6646 


752 


2538 


4324 


6110 


784CIP2B_436 


6648 


753 


2539 


4325 


6111 


784CIP2B_437 


6652 


754 


2540 


4326


6112 


784CIP2B 438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439 


6657 


756 


2542 


4328 


6114 


784CIP2B_440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441 


6663 


758 


2544 


4330 


6116 


784CIP2B_442 


6664 


759 


2545 


4331 


6117 


784CIP2B_443 


6668 


760 


2546 


4332 


6118 


784CIP2B_444 


6669 


761 


2547 


4333 


6119 


784CIP2B_445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549 


4335 


6121 


784CIP2B_447 


6687 


764 


2550 


4336 


6122 


784CIP2B_448


6689 


765 


2551 


4337 


6123 


784CIP2B_449 


6693 


766 


2552 


4338 


6124 


784CIP2B_450 


6698 


767 


2553 


4339 


6125 


784CIP2B_451 


6699 


768 


2554 


4340 


6126 


784CIP2B 452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B_454 


6713 


771 


2557 


4343 


6129 


784CIP2B_455 


6716 


772 


2558


4344 


6130 


784CIP2B_456 


6725 


773 


2559 


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B_459


6730 


776 


2562 


4348 


6134 


784CIP2B_460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136 


784CIP2B_462 


6732 


779 


2565 


4351 


6137 


784CIP2B_463 


6733 


780 


2566 


4352 


6138 


784CIP2B_464 


6737 


781 


2567 


4353 


6139 


784CIP2B_465 


6745 


782 


2568 


4354 


6140 


784CIP2B 466 


6751 


783 


2569 


4355 


6141 


784CIP2B_467 


6754 


784 


2570 


4356 


6142 


784CIP2B_468 


6758 


785 


2571 


4357 


6143 


784CIP2B_469 


6761 


786 


2572 


4358 


6144 


784CIP2B_470


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768 


788 


2574 


4360 


6146 


784CIP2B_472 


6773 


789 


2575 


4361 


6147 


784CIP2B_473 


6776 


790 


2576 


4362 


6148 


784CIP2B_474


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798 


792 


2578


4364 


6150 


784CIP2B_476 


6823 


793 


2579 


4365 


6151 


784CIP2B_477 


6825 


794 


2580 


4366 


6152 


784CIP2B_478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582 


4368 


6154 


784CIP2B_480 


6844 


797 


2583 


4369 


6155 


784CIP2B_482 


6849 


798 


2584 


4370 


6156 


784CIP2B_483 


6854 


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800 


2586 


4372 


6158


784CIP2B_485 


6861 


801 


2587 


4373 


6159 


784CIP2B_486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161


784CIP2B_488


6877


804 


2590


4376


6162


784CIP2B 489 


6880 


805


2591


4377


6163 


784CIP2B 490 


6885 


806 


2592


4378


6164 


784CIP2B 491 


6890 


807 


2593 


4379 


6165 


784CIP2B_492


6890 


808 


2594


4380 


6166 


784CIP2B_493 


6894 


809


2595


4381


6167 


784CIP2B 494 


6901 


810 


2596 


4382 


6168 


784CIP2B 495 


6904 


811 


2597 


4383 


6169 


784CIP2B 496 


6907 


812


2598 


4384 


6170 


784CIP2B 497 


6914 


813


2599 


4385 


6171 


784CIP2B 498 


6917 


814


2600 


4386 


6172 


784CIP2B 499 


6923 


815 


2601 


4387 


6173 


784CIP2B 500 


6929 


816 


2602 


4388


6174 


784CIP2B_501 


6931 


817 


2603


4389 


6175 


784CIP2B 502 


6935 


818 


2604 


4390 


6176 


784CIP2B_503 


6940 


819


2605 


4391 


6177 


784CIP2B_504 


6945 


820 


2606 


4392 


6178 


784CIP2B_505 


6946 


821 


2607 


4393 


6179 


784CIP2B 506 


6947 


822 


2608


4394 


6180 


784CIP2B 507 


6949 


823 


2609 


4395 


6181 


784CIP2B_508 


6959 


824 


2610 


4396


6182 


784CIP2B 509 


6960 


825 


2611 


4397 


6183 


784CIP2B 510 


6962 


826


2612 


4398 


6184 


784CIP2B_511


6963 


827 


2613 


4399 


6185 


784CIP2B 512 


6967 


828


2614 


4400 


6186 


784CIP2B_513 


6983 


829 


2615 


4401 


6187 


784CIP2B 514 


6988


830 


2616 


4402 


6188 


784CIP2B 515 


6996 


831 


2617 


4403 


6189 


784CIP2B 516 


7003 


832 


2618 


4404 


6190 


784CIP2B_517 


7016 


833 


2619 


4405 


6191 


784CIP2B 518 


7017 


834 


2620 


4406 


6192 


784CIP2B 519 


7025 


835


2621 


4407 


6193 


784CIP2B_520 


7025 


836


2622 


4408 


6194 


784CIP2B 521 


7025 


837 


2623 


4409 


6195 


784CIP2B 522 


7050 


838


2624 


4410 


6196 


784CIP2B_523 


7051 


839 


2625 


4411


6197 


784CIP2B_524 


7055 


840 


2626 


4412 


6198 


784CIP2B_525 


7060 


841 


2627


4413 


6199 


784CIP2B_526 


7064 


842 


2628


4414 


6200 


784CIP2B 527 


7067 


843


2629 


4415 


6201 


784CIP2B_528 


7071 


844


2630


4416


6202 


784CIP2B_529 


7072 


845 


2631 


4417 


6203 


784CIP2B_530


7073 


846


2632


4418


6204 


784CIP2B_531 


7076


847


2633


4419 


6205 


784CIP2B_532 


7088


848


2634


4420


6206 


784CIP2B_533 


7089 


849


2635


4421 


6207 


784CIP2B 534 


7091 


850


2636


4422


6208 


784CIP2B 535 


7091 


851


2637


4423 


6209 


784CIP2B 536 


7104 


852 


2638


4424 


6210 


784CIP2B 537 


7105


853


2639


4425 


6211 


784CIP2B_538 


7105 


854


2640


4426


6212 


784CIP2B 539 


7109 


855


2641


4427


6213 


784CIP2B 540 


7109 


856 


2642 


4428 


6214


784CIP2B_541


7119


857 


2643 


4429


6215 


784CIP2B 542 


7120 


858 


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431 


6217 


784CIP2B 544 


7126 


860 


2646 


4432 


6218 


784CIP2B_545


7127 


861 


2647 


4433 


6219 


784CIP2B_546 


7130 


862


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B_549 


7159 


865 


2651 


4437 


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B_551 


7175 


867 


2653 


4439 


6225 


784CIP2B_552 


7188 


868 


2654


4440 


6226 


784CIP2B_553 


7189 


869


2655


4441 


6227 


784CIP2B 554 


7190 


870 


2656 


4442


6228


784CIP2B 555 


7191 


871


2657


4443


6229


784CIP2B 556 


7203 


872 


2658 


4444


6230 


784CIP2B 557 


7204 


873


2659


4445


6231


784CIP2B_558


7208 


874 


2660 


4446


6232 


784CIP2B 559 


7209 


875 


2661 


4447


6233


784CIP2B 560 


7210 


876


2662


4448


6234 


784CIP2B 561 


7216 


877 


2663 


4449


6235


784CIP2B 562 


7221 


878


2664


4450


6236


784CIP2B 563 


7230 


879 


2665 


4451 


6237


784CIP2B_564


7237


880


2666


4452


6238


784CIP2B 565 


7240 


881 


2667


4453


6239


784CIP2B 566 


7245 


882 


2668


4454


6240


784CIP2B 567 


7250 


883 


2669 


4455 


6241


784CIP2B_568


7251


884 


2670 


4456


6242


784CIP2B_569


7255 


885 


2671


4457


6243


784CIP2B_570


7260 


886 


2672 


4458


6244


784CIP2B 571 


7265 


887 


2673


4459


6245


784CIP2B 572 


7268 


888 


2674 


4460


6246


784CIP2B_573


7275 


889 


2675


4461


6247


784CIP2B_574


7279 


890 


2676 


4462 


6248


784CIP2B 575 


7283 


891 


2677 


4463


6249 


784CIP2B 576 


7283 


892 


2678 


4464


6250


784CIP2B 577 


7287 


893 


2679


4465


6251 


784CIP2B 578 


7301 


894 


2680


4466


6252 


784CIP2B 579 


7308


895 


2681 


4467


6253


784CIP2B_580


7308 


896 


2682 


4468


6254


784CIP2B_581


7309


897 


2683


4469


6255 


784CIP2B 582 


7319 


898 


2684 


4470


6256


784CIP2B_583


7320 


899 


2685 


4471


6257 


784CIP2B 584 


7326 


900


2686


4472


6258 


784CIP2B 585 


7326 


901 


2687


4473


6259


784CIP2B_586


7334 


902 


2688 


4474


6260


784CIP2B_587


7337 


903


2689


4475


6261 


784CIP2B 588 


7339 


904 


2690


4476


6262


784CIP2B_589


7344


905


2691


4477


6263


784CIP2B 590 


7355


906


2692


4478


6264


784CIP2B 591 


7363 


907


2693


4479 


6265 


784CIP2B 592 


7363 


908


2694


4480


6266 


784CIP2B 593 


7365


909


2695


4481


6267 


784CIP2B 594 


7368 


910 


2696 


4482


6268 


784CIP2B 595 


7369 


911 


2697 


4483


6269 


784CIP2B 596 


7372 


912 


2698 


4484


6270 


784CIP2B 599 


7375 


913 


2699 


4485


6271 


784CIP2B 600 


7381 


914 


2700 


4486


6272 


784CIP2B 601 


7383 


915


2701 


4487


6273


784CIP2B_602


7387


916


2702 


4488


6274


784CIP2B 603 


7391 


917 


2703 


4489


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B_605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B 608 


7405 


922 


2708 


4494 


6280 


784CIP2B_609 


7406


923 


2709 


4495 


6281 


784CIP2B 610 


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284


784CIP2B_613


7411 


927 


2713 


4499 


6285 


784CIP2B_614 


7417 
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SEQ ID NO: 


SEQ ID 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: of 


of contig 


NO : 


docket nuraber_ 


NO: in 


length 


full- 


nucleotide 


of contig 


corresponding 


U . S . S .N . 


nucleotide 


length 


sequence 


pep t z.de 


c cn t n mo . i n 
o&u iu : in 


ftQ I A Qfl 71 C 


sequence 


pept ide 
sequence 




ocquciice 


x lUi J 

a r>T) 1 i f~"fl t" "1 on 

CI X X V- O l» X w 1 1 






"5 1 1 A 


4500 


6286


784CIP2B 615 


7418 




2715


4501 


6287 


784CIP2B 616 


7421 




2716 


4502 


6288


784CIP2B 617 


7422 


931


2717 


4503 


6289 


784CIP2B 618 


7422 


932 


2718 


4504 


6290


784CIP2B 619 


7423 


933


2719


4505


6291


784CIP2B 620


7424 




2720 


4506 


6292 


784CIP2B 621 


7426 


935


2721


4507


6293


784CIP2B 622


7427


936


2722


4508


6294


784CIP2B 623


7428


937


2723


4509


6295


784CIP2B 624


7430


938 


2724 




6296 


784CIP2B_625


7435


939


2725


4511


6297




7437 


940 


2726 


4512 


6298


784CIP2B_627


/ ft J 3 


941 


2727 


4513 


6299




7440


942 


2728 


4514


6300


784CIP2B_629


7442


943


2729


4515


6301


784CIP2B_630


7450


944 


2730 


4516 




784CIP2B_631


7451


945


2731


4517


6303


784CIP2B_632


7452


946


2732


4518


6304


784CIP2B_633


7i CA 


947 


2733


4519


6305


784CIP2B_634


7A cn 


948 


2734 


4520 


6306


784CIP2B_635


7459


949


2735


4521


6307


784CIP2B_636


/ ft 0 j. 


950 


2736 


4522 


6308


784CIP2B_637


7463


951 


2737 


4523 


6309




7466


952


2738


4524


6310 




7469


953


2739


4525 


6311 


784CIP2B_640


7473


954


2740


4526 


6312 


784CIP2B_641


7481


955


2741


4527 




784CIP2B_642


7482


956 


2742 


4528 


6314 




/ * O 6 


957 


2743


4529


6315


784CIP2B_644


7483


958 


2744 


4530 


6316


784CIP2B_645


7485


959


2745


4531


6317


784CIP2B_646


7486


960 


2746 


4532


6318 




7487


961 


2747 






784CIP2B_648


7491


962


2748


4534 


6320


784CIP2B_649


7492


963


2749


4535


6321




7494


964


2750


4536


6322


784CIP2B_651


7498 




2751


4537


6323


784CIP2B 652


7504 


966


2752


4538 


6324 


784CIP2B 653


7508 


967 


2753 


4539 


6325


784CIP2B_654


7515 




2754


4540


6326


784CIP2B 655


7518 


969 


2755


4541


6327


784CIP2B 656


7519


970


2756


4542


6328


784CIP2B 657


7521 


971 


2757


4543


6329


784CIP2B_658


7529


972


2758


4544


6330


784CIP2B 659


7532 


973 


2759 


4545


6331 


784CIP2B 660 


7533 


974 


2760 


4546 


6332


784CIP2B 661


7535 


975 


2761 


4547


6333


784CIP2B 662


7545 


976 


2762 


4548


6334 


784CIP2B 663 


7546


977


2763


4549


6335


784CIP2B 664


7552


978


2764


4550


6336


784CIP2B 665


7554 


979 


2765 


4551 


6337 


784CIP2B_666 


7567 


980 


2766


4552 


6338 


784CIP2B_667 


7569 


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669 


7576 


983 


2769 


4555 


6341 


784CIP2B_670 


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672 


7582 


986 


2772 


4558 


6344 


784CIP2B_673 


7587 


987 


2773 


4559 


6345 


784CIP2B_674 


7589 


988 


2774 


4560 


6346 


784CIP2B_675 


7597 


989 


2775


4561 


6347 


784CIP2B 676 


7597


990 


2776 


4562 


6348 


784CIP2B 677 


7609 


991 


2777 


4563 


6349 


784CIP2B_678 


7609 


992 


2778 


4564 


6350 


784CIP2B 679 


7609 


993 


2779 


4565 


6351 


784CIP2B 680 


7613 


994 


2780


4566 


6352 


784CIP2B 681 


7623 


995 


2781 


4567 


6353 


784CIP2B 682 


7629 


996 


2782 


4568 


6354 


784CIP2B_683 


7630 


997 


2783


4569


6355


784CIP2B 684 


7633 


998 


2784 


4570 


6356 


784CIP2B 685 


7635 


999 


2785 


4571 


6357 


784CIP2B_686 


7638 


1000 


2786 


4572 


6358 


784CIP2B 687 


7639 


1001 


2787


4573


6359 


784CIP2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B 689 


7647 


1003 


2789 


4575 


6361 


784CIP2B 690 


7648


1004 


2790 


4576 


6362 


784CIP2B_691 


7658 


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B 693 


7664 


1007 


2793 


4579 


6365 


784CIP2B_695 


7674 


1008 


2794 


4580 


6366 


784CIP2B 696 


7675 


1009 


2795 


4581 


6367 


784CIP2B_697 


7676 


1010 


2796 


4582 


6368 


784CIP2B 698 


7681 


1011 


2797 


4583 


6369 


784CIP2B_699 


7688


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799 


4585 


6371 


784CIP2B_701 


7694


1014 


2800 


4586


6372 


784CIP2B_702 


7715 


1015 


2801 


4587 


6373 


784CIP2B 703 


7716 


1016


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589


6375


784CIP2B_705 


7721 


1018 


2804


4590 


6376 


784CIP2B 706 


7723 


1019 


2805 


4591 


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378 


784CIP2B_708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709 


7735 


1022 


2808 


4594 


6380 


784CIP2B 710


7741 


1023


2809 


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596 


6382 


784CIP2B_712 


7748 


1025 


2811 


4597 


6383 


784CIP2B_713 


7749 


1026 


2812 


4598 


6384 


784CIP2B 714 


7750 


1027 


2813 


4599 


6385 


784CIP2B_715 


7757 


1028 


2814 


4600 


6386 


784CIP2B 716 


7759 


1029 


2815 


4601 


6387 


784CIP2B_717 


7760 


1030 


2816 


4602 


6388 


784CIP2B_718 




1031 


2817 


4603 


6389 


784CIP2B_719


7764 


1032 


2818


4604 


6390 


784CIP2B_720 


7765 


1033 


2819 


4605 


6391 


784CIP2B 721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767


1035 


2821


4607


6393 


784CIP2B 723 


7769 


1036


2822


4608 


6394


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B 725 


7774 


1038 


2824 


4610 


6396 


784CIP2B 726 


7779 


1039 


2825 


4611 


6397


784CIP2B 727


7781


1040


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783


1042 


2828 


4614 


6400 


784CIP2B 730 


7787 


1043 


2829 


4615 


6401 


784CIP2B_731 


7792 


1044 


2830 


4616 


6402 


784CIP2B_732 


7795 


1045 


2831 


4617 


6403 


784CIP2B_733 


7801


1046 


2832 


4618 


6404 


784CIP2B 734 


7807


1047 


2833 


4619 


6405 


784CIP2B 735


7808 


1048 


2834 


4620 


6406 


784CIP2B_736 


7819 


1049 


2835


4621 


6407 


784CIP2B_737 


7824 


1050 


2836 


4622 


6408 


784CIP2B_738


7826 


1051 


2837 


4623 


6409 


784CIP2B 739


7829


1052 


2838 


4624 


6410 


784CIP2B_740


7832 


1053 


2839 


4625 


6411 


784CIP2B_741 


7839 


1054 


2840 


4626 


6412 


784CIP2B_743 


7847 


1055 


2841 


4627 


6413 


784CIP2B_744 


7848 


1056


2842 


4628 


6414 


784CIP2B_745 


7853 


1057


2843 


4629 


6415 


784CIP2B 746 


7854 


1058 


2844


4630 


6416 


784CIP2B_747 


7856 


1059 


2845 


4631 


6417 


784CIP2B_748


7862


1060 


2846 


4632 


6418 


784CIP2B 749 


7865 


1061 


2847 


4633 


6419 


784CIP2B 750 


7874 


1062 


2848 


4634 


6420 


784CIP2B 751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422 


784CIP2B 753 


7882


1065 


2851 


4637 


6423 


784CIP2B 754 


7884 


1066 


2852 


4638 


6424 


784CIP2B 755 


7886


1067 


2853 


4639 


6425 


784CIP2B 756 


7888 


1068


2854 


4640 


6426


784CIP2B 757


7889


1069 


2855 


4641 


6427 


784CIP2B 758


7901 


1070 


2856 


4642 


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B 760 


7911 


1072 


2858 


4644 


6430 


784CIP2B 761 


7921 


1073 


2859


4645


6431 


784CIP2B 762 


7923 


1074 


2860 


4646 


6432 


784CIP2B 763 


7924


1075 


2861 


4647 


6433 


784CIP2B 764 


7925


1076 


2862 


4648 


6434 


784CIP2B 765


7928 


1077 


2863 


4649 


6435


784CIP2B 766 


7929 


1078 


2864 


4650 


6436 


784CIP2B 767 


7930


1079 


2865


4651 


6437 


784CIP2B 768 


7934 


1080 


2866 


4652 


6438


784CIP2B 769


7938


1081 


2867 


4653 


6439 


784CIP2B 770 


7942 


1082 


2868 


4654 


6440 


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B 772 


7946 


1084


2870 


4656 


6442 


784CIP2B 773 


7948 


1085 


2871 


4657 


6443 


784CIP2B 774 


7951 


1086 


2872 


4658 


6444 


784CIP2B 775 


7952 


1087 


2873 


4659 


6445 


784CIP2B 776 


7953 


1088 


2874 


4660 


6446 


784CIP2B 777 


7954 


1089 


2875 


4661 


6447 


784CIP2B 778 


7957 


1090 


2876 


4662 


6448 


784CIP2B_779


7958 


1091 


2877 


4663 


6449 


784CIP2B 780


7961 


1092 


2878 


4664 


6450 


784CIP2B 781 


7965 


1093 


2879 


4665 


6451 


784CIP2B 782 


7966 


1094 


2880 


4666 


6452 


784CIP2B 783 


7979 


1095 


2881 


4667 


6453 


784CIP2B_784


7986 


1096 


2882 


4668 


6454 


784CIP2B 785 


7986 


1097 


2883 


4669 


6455 


784CIP2B_786 


7988 


1098 


2884 


4670 


6456 


784CIP2B_787


7991 


1099 


2885 


4671 


6457 


784CIP2B_788


7992 


1100 


2886 


4672 


6458 


784CIP2B 789 


7992 


1101 


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460 


784CIP2B 791 


7992 


1103 


2889 


4675 


6461 


784CIP2B_792 


8003 


1104 


2890 


4676 


6462 


784CIP2B_793 


8014 


1105 


2891 


4677 


6463 


784CIP2B_794 


8015 


1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893 


4679 


6465 


784CIP2B_796 


8017 


1108 


2894 


4680 


6466 


784CIP2B_797 


8019 


1109 


2895 


4681 


6467 


784CIP2B_798 


8020 


1110 


2896 


4682 


6468 


784CIP2B_799 


8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B_801 


8028 


1113 


2899 


4685


6471 


784CIP2B_802 


8030


1114 


2900 


4686 


6472 


784CIP2B_803


8038 


1115 


2901 


4687 


6473 


784CIP2B 804 


8042 


1116 


2902 


4688 


6474 


784CIP2B_805 


8045 


1117 


2903 


4689 


6475 


784CIP2B_806


8045 


1118 


2904 


4690 


6476 


784CIP2B_807 


8046 


1119 


2905 


4691 


6477 


784CIP2B 808 


8047 


1120 


2906 


4692 


6478 


784CIP2B 809 


8051 


1121 


2907 


4693 


6479 


784CIP2B_810 


8059 


1122 


2908 


4694 


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B_812 


8069 


1124 


2910 


4696 


6482


784CIP2B 813 


8074 


1125 


2911 


4697 


6483 


784CIP2B 814 


8077 


1126 


2912 


4698 


6484 


784CIP2B 815


8078 


1127 


2913 


4699 


6485 


784CIP2B_816 


8079 


1128 


2914 


4700 


6486 


784CIP2B 817 


8084 


1129 


2915 


4701 


6487 


784CIP2B 818


8088 


1130 


2916 


4702 


6488 


784CIP2B 819 


8090 


1131 


2917 


4703 


6489 


784CIP2B 820 


8091 


1132 


2918 


4704 


6490 


784CIP2B 821 


8099 


1133 


2919 


4705 


6491 


784CIP2B 822 


8099 


1134 


2920 


4706 


6492 


784CIP2B 823 


8100 


1135 


2921 


4707 


6493 


784CIP2B_824 


8102 


1136 


2922 


4708 


6494 


784CIP2B_825 


8103 


1137 


2923 


4709 


6495 


784CIP2B 826 


8103 


1138 


2924 


4710 


6496 


784CIP2B_827 


8104 


1139 


2925 


4711 


6497 


784CIP2B 828 


8108 


1140 


2926 


4712 


6498 


784CIP2B 829 


8110 


1141 


2927 


4713 


6499 


784CIP2B 830 


8116 


1142 


2928 


4714 


6500 


784CIP2B_831


8117 


1143 


2929 


4715


6501 


784CIP2B 832 


8123 


1144 


2930 


4716 


6502 


784CIP2B 833 


8130


1145 


2931 


4717 


6503 


784CIP2B_834 


8130 


1146 


2932 


4718 


6504 


784CIP2B_835 


8143 


1147 


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506


784CIP2B 837 


8154 


1149 


2935 


4721 


6507 


784CIP2B 838 


8155 


1150 


2936 


4722 


6508 


784CIP2B 839 


8162 


1151


2937


4723 


6509 


784CIP2B_840 


8163 


1152 


2938 


4724


6510 


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B_842


8173 


1154 


2940 


4726 


6512 


784CIP2B_843 


8179 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182 


1156


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B_847 


8185 


1159 


2945 


4731


6517 


784CIP2B 848 


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B_850 


8190 


1162 


2948 


4734 


6520 


784CIP2B_851 


8190 


1163 


2949 


4735 


6521 


784CIP2B 852 


8192 


1164 


2950 


4736 


6522 


784CIP2B 853 


8193 


1165 


2951 


4737 


6523 


784CIP2B 854 


8197 


1166 


2952 


4738 


6524 


784CIP2B_855


8197 


1167 


2953 


4739 


6525 


784CIP2B_856 


8199 


1168 


2954 


4740 


6526 


784CIP2B_857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B 859 


8208 


1171 


2957 


4743 


6529


784CIP2B_860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863 


8217 


1175 


2961 


4747 


6533 


784CIP2B_864 


8223


1176


2962


4748


6534


784CIP2B_865


8224


1177


2963 


4749 


6535 


784CIP2B 866 


8226 


1178


2964


4750


6536


784CIP2B 867


8227


1179


2965 


4751 


6537 


784CIP2B 868 


8229 


1180 


2966


4752 


6538 


784CIP2B 869 


8232 


1181 


2967 


4753 


6539 


784CIP2B 870


8236


1182 


2968 


4754 


6540 


784CIP2B 871 


8239 


1183 


2969 


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542 


784CIP2B 873 


8245


1185 


2971 


4757 


6543 


784CIP2B 874


8248


1186


2972 


4758 


6544 


784CIP2B 875 


8251 


1187 


2973 


4759 


6545 


784CIP2B 876 


8253 


1188 


2974 


4760 


6546 


784CIP2B 877 


8260 


1189 


2975 


4761 


6547 


784CIP2B 878 


8262 


1190 


2976 


4762 


6548 


784CIP2B 879 


8268 


1191 


2977 


4763 


6549 


784CIP2B 880


8270 


1192 


2978 


4764 


6550 


784CIP2B 881 


8272 


1193 


2979 


4765 


6551 


784CIP2B 882 


8274 


1194 


2980 


4766 


6552 


784CIP2B_883 


8274 


1195 


2981 


4767 


6553 


784CIP2B 884 


8275 


1196 


2982 


4768 


6554 


784CIP2B 885 


8277 


1197 


2983 


4769 


6555 


784CIP2B 886


8281 


1198 


2984 


4770 


6556


784CIP2B_887 


8283 


1199 


2985 


4771 


6557 


784CIP2B 888 


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B 890 


8300 


1202 


2988 


4774 


6560 


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B 892 


8304 


1204


2990 


4776 


6562 


784CIP2B 893 


8305 


1205 


2991 


4777 


6563 


784CIP2B 894 


8309 


1206


2992


4778


6564


784CIP2B 895


8318


1207


2993 


4779 


6565 


784CIP2B_896 


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B 899 


8323 


1211 


2997 


4783 


6569 


784CIP2B 900 


8325 


1212 


2998 


4784 


6570 


784CIP2B 901 


8331 


1213 


2999 


4785 


6571 


784CIP2B 902 


8332 


1214 


3000 


4786 


6572 


784CIP2B 903 


8333 


1215 


3001


4787 


6573 


784CIP2B_904 


8335 


1216


3002 


4788 


6574 


784CIP2B 905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906


8337 


1218 


3004 


4790 


6576 


784CIP2B 907 


8340 


1219 


3005 


4791 


6577 


784CIP2B 908 


8343 


1220


3006 


4792 


6578 


784CIP2B 909 


8347


1221


3007


4793 


6579 


784CIP2B_910 


8349 


1222 


3008 


4794 


6580 


784CIP2B 911 


8351


1223


3009


4795 


6581 


784CIP2B_912 


8353


1224


3010 


4796 


6582 


784CIP2B 913 


8355 


1225


3011 


4797 


6583 


784CIP2B 914 


8361 


1226


3012


4798 


6584 


784CIP2B 915 


8365 


1227 


3013 


4799 


6585 


784CIP2B 916 


8367 


1228 


3014 


4800 


6586


784CIP2B 917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230


3016 


4802 


6588 


784CIP2B 920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921


8391 


1232 


3018 


4804 


6590 


784CIP2B 922 


8393 


1233 


3019 


4805 


6591 


784CIP2B 923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807


6593 


784CIP2B 925 


8395 


1236 


3022 


4808 


6594 


784CIP2B_926


8396


1237 


3023 


4809 


6595 


784CIP2B 927 


8398


1238 


3024 


4810 


6596 


784CIP2B_928 


8402 


1239 


3025 


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B 930 


8405 


1241 


3027 


4813 


6599 


784CIP2B 931 


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409 


1243 


3029 


4815 


6601 


784CIP2B 933 


8410 


1244 


3030 


4816 


6602 


784CIP2B 934


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818


6604


784CIP2B 936 


8419 


1247 


3033 


4819 


6605 


784CIP2B 937 


8426 


1248 


3034 


4820 


6606


784CIP2B 938 


8430 


1249 


3035 


4821 


6607 


784CIP2B 939 


8431 


1250 


3036 


4822


6608 


784CIP2B 940 


8432 


1251 


3037 


4823 


6609 


784CIP2B_941


8433 


1252 


3038 


4824


6610 


784CIP2B 942 


8434 


1253 


3039 


4825


6611 


784CIP2B 943 


8438 


1254 


3040 


4826 


6612 


784CIP2B_944 


8439 


1255 


3041 


4827 


6613 


784CIP2B 945 


8441 


1256 


3042 


4828 


6614 


784CIP2B_946 


8450 


1257 


3043 


4829


6615 


784CIP2B_947


8451 


1258 


3044 


4830 


6616 


784CIP2B_948 


8452 


1259 


3045 


4831 


6617 


784CIP2B_949 


8460 


1260 


3046 


4832 


6618 


784CIP2B 950 


8461 


1261 


3047 


4833


6619 


784CIP2B 951 


8462 


1262 


3048 


4834 


6620 


784CIP2B 952 


8464 


1263 


3049 


4835 


6621 


784CIP2B 953 


8465 


1264 


3050 


4836 


6622 


784CIP2B 954 


8467 


1265 


3051 


4837 


6623


784CIP2B 955 


8470 


1266 


3052 


4838 


6624 


784CIP2B 956 


8471 


1267 


3053 


4839 


6625 


784CIP2B_957 


8473 


1268 


3054 


4840 


6626 


784CIP2B 958 


8474 


1269 


3055 


4841 


6627 


784CIP2B_959


8475 


1270 


3056 


4842 


6628


784CIP2B 960 


8476 


1271 


3057 


4843 


6629 


784CIP2B 961 


8480 


1272 


3058 


4844 


6630 


784CIP2B_962 


8482 


1273 


3059 


4845 


6631 


784CIP2B_963 


8482 


1274 


3060 


4846


6632


784CIP2B 964


8486 


1275 


3061 


4847 


6633 


784CIP2B 965 


8488 


1276 


3062 


4848 


6634 


784CIP2B_966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967 


8494 


1278 


3064 


4850 


6636 


784CIP2B_968


8496


1279 


3065 


4851 


6637 


784CIP2B_969


8497 


1280


3066 


4852 


6638 


784CIP2B_970


8499 


1281 


3067 


4853 


6639 


784CIP2B 971 


8513 


1282 


3068 


4854 


6640 


784CIP2B 972 


8522 


1283 


3069 


4855 


6641 


784CIP2B 973 


8526 


1284 


3070 


4856 


6642 


784CIP2B 974 


8531 


1285 


3071 


4857 


6643 


784CIP2B_975 


8533 


1286 


3072 


4858 


6644 


784CIP2B_976 


8542 


1287 


3073 


4859 


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646 


784CIP2B_978 


8565 


1289 


3075 


4861 


6647 


784CIP2B_979 


8565 




3076 


4862


6648 


784CIP2B_980 


8572 


1291 


3077 


4863 


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B 982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983 


8584 


1294 


3080 


4866 


6652 


784CIP2B_984 


8598 


1295 


3081 


4867 


6653 


784CIP2B 985


8602


1296


3082 


4868 


6654 


784CIP2B_986 


8604 


1297 


3083 


4869 


6655 


784CIP2B_987 


8609 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657 


784CIP2B 989 


8637


1300 


3086 


4872 


6658 


784CIP2B 990 


8640 


1301 


3087 


4873 


6659 


784CIP2B 991 


8643 


1302 


3088 


4874 


6660 


784CIP2B 992 


8645 


1303 


3089 


4875


6661 


784CIP2B 993 


8650 


1304 


3090 


4876 


6662 


784CIP2B 994


8651 


1305 


3091 


4877 


6663 


784CIP2B 995


8654 


1306 


3092


4878


6664


784CIP2B 996


8655 


1307 


3093 


4879 


6665 


784CIP2B 997


8657 


1308 


3094 


4880


6666


784CIP2B 998


GOOD 


1309 


3095 


4881


6667


784CIP2B_999


OO 0 O 


1310 


3096


4882


6668


784CIP2B_1000


OO / I 


1311 


3097


4883




784CIP2B_1001


8672 


1312 


3098


4884


6670


784CIP2B_1002


8692 


1313 


3099


4885


6671


784CIP2B_1003


8706 


1314 


3100


4886


6672


784CIP2B_1004


DTI £ 

o /lb 


1315


3101


4887


6673


784CIP2B_1005


Q7T O 

o /xy 


1316


3102


4888


6674


784CIP2B_1006


O J 


1317


3103


4889


6675


784CIP2B_1007


8764 


1318


3104


4890


6676


784CIP2B_1008


8764 






4891


6677




8764


1320 


3106 


4892


6678


784CIP2B_1010


8774


1321 


3107 


4893


6679




0 /oz 


1322 


3108 


4894


6680


784CIP2B_1012


H / 3b 


1323 


3109 


4895


6681


784CIP2B_1013


Dd< / 


1324


3110


4896


6682


784CIP2B_1014


8842


1325 


3111 


4897


6683


784CIP2B_1015


8842


1326 


3112


4898


6684


784CIP2B_1016


8858


1327 


3113 


4899


6685


784CIP2B_1017


8871


1328 


3114 


4900 


6686


784CIP2B_1018


8921 


1329 


3115 


4901


6687






1330 


3116 


4902


6688




Q O A O 


1331 


3117 


4903




784CIP2B_1021


Q QQA 


1332


3118


4904


6690


784CIP2B_1022




1333 




4905






9028 


1334


3120


4906


6692


784CIP2B_1024


9058 


1335 


3121 


4907


6693


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 






1337 


3123 


4909 


6695


784CIP2B 1027




1338 


3124 


4910 


6696 


784CIP2B 1028


QfJR7 


1339 


3125 


4911 


6697 


784CIP2B 1029


9084 


1340 


3126 


4912 


6698 


784CIP2B 1030


9093 


1341 


3127 


4913 


6699 


784CIP2B 1031


9101 


1342 


3128 


4914 


6700 




9103 


1343 


3129 


4915 


6701 


784CIP2B 1033


9105 


1344 


3130 


4916 


6702 


784CIP2B 1034 


9151 


1345 


3131 


4917


6703 


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B 1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B 1038


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B 1040


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B_1042 


9256 


1353 


3139 


4925 


6711 


784CIP2B_1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B_1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046


9435 


1357 


3143 


4929 


6715 


784CIP2B_1047 


9437 


1358


3144


4930


6716


784CIP2B_1048


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B_1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B 1051 


9520


1362 


3148 


4934 


6720 


784CIP2B_1052 


9541 


1363 


3149


4935 


6721 


784CIP2B_1053 


9541 


1364 


3150 


4936 


6722 


784CIP2B_1054 


9548 


1365 


3151 


4937 


6723 


784CIP2B_1055 


9556 


1366 


3152 


4938 


6724 


784CIP2B_1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B_1057 


9575 


1368 


3154 


4940 


6726 


784CIP2B_1058


9589 


1369 


3155 


4941 


6727 


784CIP2B_1059


9599 


1370 


3156 


4942 


6728


784CIP2B_1060 


9602 


1371 


3157 


4943 


6729 


784CIP2B_1061


9606 


1372 


3158 


4944 


6730 


784CIP2B_1062 


9622 


1373 


3159 


4945 


6731 


784CIP2B_1063


9623 


1374 


3160 


4946


6732


784CIP2B_1064 


9646 


1375 


3161 


4947 


6733 


784CIP2B_1065 


9747 


1376 


3162 


4948 


6734 


784CIP2B_1066 


9773 


1377 


3163 


4949 


6735 


784CIP2B_1067 


9785 


1378 


3164 


4950 


6736 


784CIP2B_1068 


9801 


1379 


3165 


4951 


6737 


784CIP2B_1069 


9811 


1380 


3166 


4952 


6738 


784CIP2B_1070 


9843 


1381 


3167


4953


6739 


784CIP2B_1071 


9854 


1382


3168 


4954 


6740 


784CIP2B 1072 


9854 


1383


3169


4955 


6741 


784CIP2B_1073 


9864 


1384


3170


4956


6742


784CIP2B_1074


9864


1385


3171 


4957 


6743 


784CIP2B_1075 


9871 


1386


3172 


4958 


6744 


784CIP2B 1076 


9879 


1387


3173 


4959 


6745 


784CIP2B_1077 


9881 


1388


3174 


4960 


6746 


784CIP2B_1078 


9885 


1389 


3175 


4961 


6747 


784CIP2B_1079 


9901 


1390 


3176 


4962 


6748 


784CIP2B_1080 


9912 


1391 


3177 


4963 


6749 


784CIP2B_1081 


9916 


1392 


3178 


4964 


6750 


784CIP2B_1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B_1083 


9925 


1394 


3180 


4966 


6752 


784CIP2B_1084 


9930 


1395 


3181 


4967 


6753 


784CIP2B_1085 


9949 


1396 


3182 


4968 


6754 


784CIP2B_1086 


9951 


1397 


3183 


4969 


6755 


784CIP2B_1087 


9959 


1398 


3184 


4970 


6756 


784CIP2B_1088 


9973 


1399 


3185 


4971 


6757 


784CIP2B_1089


9982 


1400 


3186 


4972 


6758 


784CIP2B_1090 


9994 


1401 


3187


4973 


6759 


784CIP2B_1091 


10021 


1402 


3188 


4974 


6760 


784CIP2B_1092


10041 


1403 


3189 


4975 


6761 


784CIP2B_1094 


10067 


1404 


3190 


4976 


6762 


784CIP2B_1095 


10073 


1405 


3191 


4977 


6763 


784CIP2B_1096 


10112 


1406 


3192 


4978 


6764


784CIP2B_1097 


10117 


1407 


3193 


4979 


6765 


784CIP2B_1098 


10132 


1408 


3194 


4980 


6766 


784CIP2B_1099 


10169 


1409 


3195 


4981 


6767 


784CIP2B_1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B_1101


10226


1411 


3197 


4983 


6769 


784CIP2B_1102 


10232 


1412 


3198 


4984 


6770 


784CIP2B_1103


10237 


1413 


3199 


4985 


6771 


784CIP2B_1104


10279 


1414 


3200 


4986 


6772 


784CIP2C_1 


33 


1415 


3201 


4987 


6773 


784CIP2C_2 


271 


1416 


3202 


4988 


6774 


784CIP2C_3


848 


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205 


4991 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778 


784CIP2C_7 


980 


1421 


3207 


4993 


6779 


784CIP2C_8


1595 


1422 


3208 


4994 


6780 


784CIP2C_9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10 


1744


1424


3210 


4996 


6782 


784CIP2C_11 


1937 


1425 


3211 


4997 


6783 


784CIP2C_12 


1955 


1426 


3212 


4998 


6784 


784CIP2C_13


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C_19


2948 


1433 


3219 


5005 


6791 


784CIP2C_20 


2956 


1434 


3220 


5006 


6792 


784CIP2C_21


2959 


1435 


3221 


5007 


6793 


784CIP2C_22


2965 


1436 


3222 


5008 


6794 


784CIP2C_23


2966 


1437 


3223 


5009 


6795 


784CIP2C_24 


2970 


1438 


3224 


5010 


6796 


784CIP2C_25 


2985 


1439 


3225 


5011 


6797 


784CIP2C_26


2987 


1440 


3226 


5012 


6798 


784CIP2C_27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28


2993 


1442 


3228 


5014 


6800 


784CIP2C_29


3017 


1443 


3229 


5015 


6801 


784CIP2C_30 


3046 


1444 


3230 


5016 


6802 


784CIP2C_31 


3050 


1445 


3231 


5017 


6803 


784CIP2C_32 


3357 


1446 


3232 


5018 


6804 


784CIP2C_33 


3359 


1447 


3233 


5019 


6805 


784CIP2C_34 


3432 


1448 


3234 


5020 


6806


784CIP2C_35


3438 


1449 


3235 


5021 


6807 


784CIP2C_36


3439 


1450 


3236 


5022 


6808 


784CIP2C_39


3463 


1451 


3237 


5023 


6809 


784CIP2C_40


3466 


1452 


3238 


5024 


6810 


784CIP2C_41


3466 


1453 


3239 


5025 


6811


784CIP2C_42


3467 


1454 


3240 


5026 


6812 


784CIP2C_43 


3468 


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028 


6814 


784CIP2C_45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46 


3488 


1458 


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C_48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49


3494 


1461 


3247 


5033 


6819 


784CIP2C_50 


3495 


1462 


3248 


5034 


6820 


784CIP2C_51


3496 


1463 


3249 


5035 


6821 


784CIP2C_52 


3503 


1464 


3250 


5036 


6822 


784CIP2C_53 


3503 


1465 


3251 


5037 


6823 


784CIP2C_54 


3504 


1466 


3252 


5038 


6824 


784CIP2C_55


3511


1467 


3253 


5039 


6825 


784CIP2C_56 


3531 


1468 


3254 


5040 


6826 


784CIP2C_57


3536 


1469 


3255 


5041 


6827 


784CIP2C_58


3545


1470 


3256 


5042 


6828 


784CIP2C_59


3548 


1471 


3257 


5043 


6829 


784CIP2C_60 


3551 


1472 


3258 


5044 


6830 


784CIP2C_61


3553 


1473 


3259 


5045 


6831 


784CIP2C_62


3564


1474 


3260 


5046 


6832 


784CIP2C_63


3567 


1475 


3261 


5047 


6833 


784CIP2C_64 


3572 


1476 


3262 


5048 


6834 


784CIP2C_65 


3573 


1477 


3263 


5049


6835


784CIP2C_66 


3574 


1478 


3264 


5050 


6836 


784CIP2C_67


3583 


1479 


3265 


5051 


6837 


784CIP2C_68 


3615 


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629


1482 


3268 


5054 


6840 


784CIP2C_71


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1484 


3270 


5056 


6842 


784CIP2C_73


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912 


1486 


3272 


5058 


6844 


784CIP2C_75 


3924 


1487 


3273


5059 


6845 


784CIP2C_76


3928 


1488 


3274 


5060 


6846 


784CIP2C_77


3935 


1489 


3275 


5061 


6847 


784CIP2C_78


3959 


1490 


3276 


5062 


6848


784CIP2C_79


3981 


1491 


3277 


5063


6849 


784CIP2C_80


3989 


1492 


3278 


5064 


6850 


784CIP2C_81


4295


1493


3279 


5065 


6851 


784CIP2C_82


4300 


1494 


3280 


5066 


6852 


784CIP2C_83


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496 


3282 


5068 


6854 


784CIP2C_85


4371 


1497


3283 


5069 


6855 


784CIP2C_86


4373 


1498 


3284 


5070 


6856 


784CIP2C_87


4376 


1499 


3285 


5071 


6857 


784CIP2C_89


4378


1500 


3286 


5072 


6858 


784CIP2C_90


4382 


1501 


3287 


5073 


6859 


784CIP2C_91


4409 


1502 


3288 


5074 


6860 


784CIP2C_92


4421 


1503 


3289 


5075 


6861 




4421 


1504 


3290 


5076 


6862 


784CIP2C_94




1505 


3291 


5077 


6863 


784CIP2C_95


a a i n 

H H J U 


1506 


3292 


5078 


6864 


784CIP2C_96




1507 


3293 


5079 


6865 


784CIP2C_97


4436 


1508 


3294 


5080 


6866 


784CIP2C_98


A At Q 


1509 


3295 


5081 


6867 


784CIP2C_99


4440


1510 


3296 


5082 


6868 


784CIP2C_100


4441 


1511 


3297 


5083 


6869 


784CIP2C_101


4442


1512 


3298


5084 


6870 


784CIP2C_102


4455 


1513 


3299 


5085 


6871 


784CIP2C_103


4462 


1514 


3300 


5086 


6872


784CIP2C_104


4466 


1515 


3301 


5087 


6873 


784CIP2C_105


4469 


1516 


3302 


5088 


6874 


784CIP2C_106


4477 


1517 


3303 


5089 


6875 


784CIP2C_107


4481 


1518 


3304 


5090 


6876 


784CIP2C_108


4483 


1519 


3305 


5091 


6877 


784CIP2C_109


4484 


1520 


3306 


5092 


6878 


784CIP2C_110


4486 


1521 


3307 


5093 


6879 


784CIP2C_111


4490 


1522 


3308 


5094 


6880 


784CIP2C_112


4499 


1523 


3309 


5095 


6881 


784CIP2C_113


4503 


1524 


3310 


5096 


6882 


784CIP2C_114


4506 


1525 


3311 


5097 


6883 


784CIP2C_115


4509 


1526


3312 


5098 


6884 


784CIP2C_116


4514 


1527


3313 


5099 


6885


784CIP2C_117


4516 


1528 


3314 


5100 


6886 


784CIP2C_118


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C_120


4527 


1531 


3317 


5103 


6889 


784CIP2C_121


4528 


1532 


3318 


5104 


6890 


784CIP2C_122


4529 


1533 


3319 


5105


6891 


784CIP2C_123


4532 


1534 


3320 


5106 


6892 


784CIP2C_124


4537 


1535 


3321 


5107 


6893 


784CIP2C_125


4538 


1536 


3322 


5108 


6894 


784CIP2C_126


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324 


5110 


6896 


784CIP2C_128


4559 


1539 


3325 


5111 


6897 


784CIP2C_129


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


3328 


5114 


6900 


784CIP2C_133


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135


4616 


1545 


3331 


5117 


6903 


784CIP2C_136


4617 


1546 


3332 


5118 


6904 


784CIP2C_137


4618 


1547 


3333 


5119 


6905 


784CIP2C_138


4620 


1548 


3334 


5120 


6906 


784CIP2C_139 


4624 


1549 


3335 


5121 


6907 


784CIP2C_140 


4632 


1550 


3336 


5122 


6908


784CIP2C_141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C_143 


4639 


1553 


3339 


5125 


6911 


784CIP2C_144 


4643 


1554 


3340


5126 


6912 


784CIP2C_145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146 


4655 


1556 


3342 


5128 


6914 


784CIP2C_147 


4668 


1557 


3343 


5129 


6915 


784CIP2C_148 


4677 


1558


3344 


5130 


6916 


784CIP2C_149 


4677 


1559


3345 


5131


6917 


784CIP2C_150 


4677 


1560 


3346 


5132 


6918 


784CIP2C_152 


4682 


1561 


3347 


5133 


6919 


784CIP2C_153 


4690 


1562 


3348 


5134 


6920 


784CIP2C_154


4691 


1563 


3349 


5135 


6921 


784CIP2C_155 


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C_157


4734 


1566 


3352 


5138 


6924 


784CIP2C_158 


4757 


1567 


3353 


5139 


6925 


784CIP2C_159


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786


1569 


3355 


5141 


6927 


784CIP2C_161 


4793 


1570 


3356 


5142 


6928 


784CIP2C_162 


4825 


1571 


3357 


5143 


6929 


784CIP2C_163


4826 


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931 


784CIP2C_165 


4853 


1574 


3360 


5146 


6932 


784CIP2C_166


4855 


1575 


3361


5147 


6933 


784CIP2C_167


4856 


1576 


3362 


5148 


6934 


784CIP2C_168 


4867 


1577 


3363 


5149 


6935 


784CIP2C_169 


4869 


1578 


3364 


5150 


6936 


784CIP2C_170


4878 


1579 


3365


5151 


6937 


784CIP2C_171 


4880 


1580 


3366 


5152 


6938 


784CIP2C_172 


4942 


1581 


3367 


5153 


6939 


784CIP2C_173


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950 


1583 


3369 


5155 


6941


784CIP2C_175


4952 


1584 


3370


5156 


6942 


784CIP2C_176


4954 


1585 


3371 


5157 


6943 


784CIP2C_177 


4958 


1586 


3372 


5158 


6944 


784CIP2C_178 


4961 


1587 


3373 


5159 


6945 


784CIP2C_179 


5590 


1588 


3374 


5160 


6946 


784CIP2C_180 


5599 


1589


3375 


5161 


6947 


784CIP2C_181 


5692 


1590 


3376 


5162 


6948 


784CIP2C_182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950 


784CIP2C_184 


5771 


1593 


3379 


5165 


6951 


784CIP2C_185 


5774 


1594 


3380 


5166 


6952 


784CIP2C_186


5793 


1595 


3381 


5167 


6953 


784CIP2C_187 


5806 


1596 


3382 


5168 


6954 


784CIP2C_188


5852 


1597 


3383 


5169 


6955 


784CIP2C_189 


5892 


1598 


3384 


5170 


6956 


784CIP2C_190 


6057 


1599 


3385 


5171 


6957 


784CIP2C_191




1600 


3386 


5172 


6958 


784CIP2C_192 


6109 


1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C_194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604 


3390 


5176 


6962 


784CIP2C_196


6398 


1605 


3391 


5177 


6963 


784CIP2C_197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198


6448 


1607 


3393 


5179 


6965 


784CIP2C_199 


6469 


1608 


3394 


5180 


6966


784CIP2C_200


6476 


1609


3395


5181 


6967 


784CIP2C_201


6561 


1610 


3396 


5182 


6968 


784CIP2C_202 


6574 


1611 


3397 


5183 


6969 


784CIP2C_203


6578 


1612 


3398 


5184 


6970 


784CIP2C_204


6662 


1613 


3399 


5185 


6971 


784CIP2C_205


6672 


1614 


3400 


5186 


6972 


784CIP2C_206


6691 


1615 


3401 


5187 


6973 


784CIP2C_207 


6695 


1616 


3402 


5188 


6974 


784CIP2C_208 


6746 


1617 


3403 


5189 


6975 


784CIP2C_209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C_211 


6943 


1620 


3406 


5192 


6978 


784CIP2C_212


7110 


1621 


3407 


5193 


6979 


784CIP2C_213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214 


7212 


1623 


3409 


5195 


6981 


784CIP2C_215 


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


7249 


1625 


3411 


5197 


6983 


784CIP2C_217 


7500 


1626 


3412 


5198 


6984 


784CIP2C_218 


7509 


1627 


3413 


5199 


6985 


784CIP2C_219 


7523 


1628 


3414 


5200 


6986 


784CIP2C_220 


7544


1629 


3415 


5201 


6987 


784CIP2C_221


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224


7813 


1633 


3419 


5205 


6991 


784CIP2C_225


7831 


1634 


3420 


5206 


6992 


784CIP2C_226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228 


7943 


1637 


3423 


5209 


6995 


784CIP2C_229 


8175 


1638 


3424 


5210 


6996 


784CIP2C_230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233 


8397 


1642 


3428 


5214 


7000 


784CIP2C_234


8466 


1643 


3429 


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238 


9139 


1647 


3433 


5219 


7005 


784CIP2C_239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650 


3436 


5222 


7008 


784CIP2C_242 


9933 


1651 


3437 


5223 


7009 


784CIP2C_243


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653


3439 


5225 


7011 


784CIP2D_1 


746 


1654 


3440 


5226 


7012 


784CIP2D_2 


3558 


1655 


3441 


5227 


7013 


784CIP2D_3


3558 


1656 


3442 


5228 


7014 


784CIP2D_4 


3633 


1657


3443 


5229 


7015 


784CIP2D_5 


3658


1658 


3444 


5230 


7016 


784CIP2D_6 


3732 


1659 


3445 


5231 


7017 


784CIP2D_7 


4004


1660


3446 


5232 


7018 


784CIP2D_8 


4700 


1661


3447


5233 


7019 




4703


1662 


3448 


5234 


7020 


784CIP2D_10


4774 


1663 


3449 


5235 


7021 


784CIP2D_11


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443 


1667 


3453 


5239 


7025 


784CIP2D_15


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D_17 


8727 


1670 


3456 


5242 


7028 


784CIP2D_18 


8734 


1671 


3457 


5243 


7029 


784CIP2D_19 


8756 


1672 


3458 


5244 


7030 


784CIP2D_20


8818 


1673 


3459 


5245 


7031 


784CIP2D_21 


8844


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247 


7033 


784CIP2D_23 


8912 


1676 


3462 


5248 


7034 


784CIP2D_24


8918 


1677 


3463 


5249 


7035 


784CIP2D_25 


8918 


1678 


3464 


5250 


7036 


784CIP2D_26


8941 


1679 


3465 


5251 


7037 


784CIP2D_27 


8941


1680 


3466 


5252 


7038 


784CIP2D_28


8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D_30


9007 


1683 


3469 


5255 


7041 


784CIP2D_31


9012 


1684 


3470 


5256 


7042 


784CIP2D_32


9013 


1685 


3471 


5257 


7043 


784CIP2D_33


9025 


1686 


3472 


5258 


7044 


784CIP2D_34


9053 


1687 


3473 


5259 


7045 


784CIP2D_35


9054 


1688 


3474 


5260 


7046


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D_37


9113 


1690 


3476 


5262 


7048 


784CIP2D_38


9134 


1691 


3477 


5263 


7049 


784CIP2D_39


9152


1692 


3478 


5264 


7050 


784CIP2D_40


9152 


1693 


3479 


5265 


7051 


784CIP2D_41


9211 


1694 


3480 


5266 


7052 


784CIP2D_42


9223 


1695 


3481 


5267 


7053 


784CIP2D_43 


9223 


1696 


3482 


5268 


7054 


784CIP2D_44


9231 


1697 


3483 


5269 


7055 


784CIP2D_45


9236 


1698 


3484 


5270 


7056 


784CIP2D_46


9236 


1699 


3485 


5271 


7057 


784CIP2D_47


9303 


1700 


3486 


5272 


7058 


784CIP2D_48


9309 


1701 


3487 


5273 


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060 


784CIP2D_50


9326 


1703 


3489 


5275 


7061 


784CIP2D_51


9339 


1704 


3490 


5276 


7062 


784CIP2D_52


9348 


1705 


3491 


5277 


7063 


784CIP2D_53


9376 


1706 


3492


5278 


7064 


784CIP2D_54


9382 


1707 


3493


5279 


7065 


784CIP2D_55


9407 


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495 


5281 


7067 


784CIP2D_57


9439 


1710 


3496


5282 


7068 


784CIP2D_58


9485 


1711 


3497 


5283 


7069 


784CIP2D_59


9493 


1712 


3498 


5284 


7070 


784CIP2D_60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61 


9526 


1714 


3500 


5286 


7072 


784CIP2D_62


9526 


1715 


3501 


5287 


7073 


784CIP2D_63


9551 


1716 


3502 


5288 


7074 


784CIP2D_64


9557 


1717


3503 


5289 


7075 


784CIP2D_65


9568 


1718 


3504 


5290 


7076 


784CIP2D_66


9588 


1719 


3505 


5291 


7077 


784CIP2D_67 


9597


1720 


3506 


5292 


7078 


784CIP2D_68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69 


9628 


1722 


3508 


5294 


7080 


784CIP2D_70


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511


5297 


7083


784CIP2D_73 


9662 


1726 


3512 


5298


7084


784CIP2D_74


9725 


1727 


3513 


5299 


7085 


784CIP2D_75 


9746 


1728 


3514 


5300 


7086 


784CIP2D_76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787 


1730 


3516 


5302 


7088 


784CIP2D_78


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842 


1733 


3519 


5305 


7091 


784CIP2D_81 


9848


1734 


3520 


5306 


7092 


784CIP2D_82 


9867 


1735 


3521 


5307 


7093 


784CIP2D_83


10010


1736 


3522 


5308 


7094 


784CIP2D_84


10011 


1737 


3523 


5309 


7095 


784CIP2D_85


10052 


1738 


3524 


5310 


7096 


784CIP2D_86


10057 


1739 


3525 


5311 


7097 


784CIP2D_87


10085 


1740 


3526 


5312 


7098 


784CIP2D_89


10139 


1741 


3527 


5313 


7099 


784CIP2D_90 


10142 


1742 


3528 


5314 


7100 


784CIP2D_92 


10165 


1743 


3529 


5315 


7101 


784CIP2D_93


10173 


1744 


3530 


5316 


7102 


784CIP2D_94


10173 


1745 


3531 


5317 


7103 


784CIP2D_95


10273 


1746 


3532 


5318 


7104 


784CIP2E_1


3121 


1747 


3533 


5319 


7105 


784CIP2E_2 


3628 


1748 


3534 


5320 


7106 


784CIP2E_4


3673 


1749 


3535 


5321 


7107 


784CIP2E_5 


4018 


1750 


3536 


5322 


7108 


784CIP2E_6 


4467 


1751 


3537 


5323 


7109 


784CIP2E_7


4865 


1752 


3538 


5324 


7110 


784CIP2E_8 


4916 


1753 


3539 


5325 


7111 


784CIP2E_9 


4923 


1754 


3540 


5326 


7112 


784CIP2E_10 


4926 


1755 


3541 


5327 


7113 


784CIP2E_11


4962 


1756 


3542 


5328 


7114 


784CIP2E_12 


4963 


1757 


3543 


5329 


7115 


784CIP2E_13


4964 


1758 


3544 


5330


7116 


784CIP2E_14 


4988 


1759 


3545 


5331 


7117 


784CIP2E_15


5835 


1760 


3546 


5332 


7118 


784CIP2E_16 


7682 


1761 


3547 


5333 


7119 


784CIP2E_17


7682 


1762 


3548 


5334 


7120 


784CIP2E_18


7699 


1763 


3549 


5335 


7121


784CIP2E_19


7707 


1764


3550


5336


7122 


784CIP2E_20


7707 


1765 


3551 


5337 


7123 


784CIP2E_21 


7752 


1766 


3552 


5338 


7124 


784CIP2E_22 


8357 


1767 


3553 


5339 


7125 


784CIP2E_23 


9065 


1768 


3554 


5340 


7126 


784CIP2E_24


9324 


1769 


3555 


5341 


7127 


784CIP2F_1 


2976 


1770 


3556 


5342 


7128 


784CIP2F_2


3559 


1771 


3557 


5343 


7129 


784CIP2F_3 


4021 


1772 


3558 


5344 


7130 


784CIP2F_4


4474 


1773 


3559 


5345 


7131 


784CIP2F_5


4566 


1774 


3560 


5346 


7132 


784CIP2F_6


4705 


1775 


3561 


5347 


7133 


784CIP2F_7


4707 


1776 


3562 


5348 


7134 


784CIP2F_8


4712 


1777 


3563 


5349 


7135 


784CIP2F_9 


5006 


1778 


3564 


5350 


7136 


784CIP2F_10


5009 


1779 


3565 


5351 


7137 


784CIP2F_11


" 501* 


1780 


3566 


5352 


7138


784CIP2F_12


5015 


1781 


3567


5353 


7139 


784CIP2F_13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8828 


1784 


3570 


5356 


7142 


784CIP2F_16


8830 


1785 


3571 


5357 


7143 


784CIP2F_17 


9739 


1786


3572 


5358 


7144 


784CIP2F_18 


9896 
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TABLE 7 



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHL S ARLS AL I LDE VA I L P APQNLS VLS TNM KHLLMWS P V I APG 
ETVYYSVEYQGEYESLYTSHIWIPSSWCSLTEGPECDVTDDITA 
TVPYNLRVRATLGSOTS/CLEHP/VSIPLIETQPSLPDL/RMEI 

TVTV TUT \TT T7T TTT*>T J"*D/"MnPTTT TnVUDO DDf ^T?trtr\7WTirD Cfl/ZT 13 

lISlAjr nijV xauCiULKsryjr Ctr JjVHI WKKCilrvsAllillilV ftfj VrtoWjXlr 

VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVOGEAIPL 
VLALFAF VG FML I LVW P L FVWKMGRLLQ / YLLL PRGGS S QTP W 
KITQF 


5360 


2 


1115 


PR VRS SGGQED P ASl^QWARPRFTQ PS KMRRRVI ARP VGS S VRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWrLSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTT S FQC KVKS DVK.P V I QWuKRVE i GAEvRnWSTXDVGGUKr 
WLPTGDVWSRPDGS YUJKLLITRARQDDAGMY I CLGANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGXPAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALSAG PGVGLCEEHGS PAAPQHLLG PG PVAG PKLYPKL YTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGS I SSANILLDDQFQPKLTDFAMAHFRSHLEHQSCTINMTSS 
SSKHLWYMPEEY IRQGKLS IKTDVYSFG I V IMEVLTGCRWLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAATRAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
I»RP YKVN I DP S S E APGHS CRS RP VES S CSSKFSWDEYEQYKKE 


5362 


2 


4879 


SCQVEGCTRTYNSSQSIGKHMKTAHPDQYAAFKMQRKSKKGQKA 
NNLNTPNNGKFVYFLPS PVNS SNPFFTSQTKANGNPACSAQLQH 
VSPPIFPAHLASVSTPLLSSMESVINPNITSQDKNEQGGMLCSQ 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPS PADSGTNSVFSQLENNTNHYSSQIEGNTNS S FLKGGNGENA 
VFPSQVNVANNFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAI I RDGKFI CSRCYRAFTNPRS LGGHLS KRS YCKPLDGA 
E I AQ ELLQ SNGQ PS LLAS M I LS TNAVNLQQPQQS TFNPEACFKD 
PSFlrQLLAENRSPAFLPNTFPRSGVTNFKTSVSQEGSEI IIQAL 
ETAG I PSTFEGAEMLSHVSTGCVSDASQVNATVMPNPTVPPLLH 
TVCHPNTIiliTNQWRTSNS KTSS IEECSSLPVFPTNDLLLKTVEN 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKKGNS 
ASKRRKKVAPPLIAPNASQNLVTSDLTTMGLIAKSVEIPTTNLH 
SNVIPTCEPQS LVENLTQKLNNVNNQLFMTD VKEN FKTS LE S HT 
VLAPLTLKTENGDS QMMALNS CTTSVNS DLQI SBDNVIQNFEKT 
LEIIKTAMNSQILEVKSGSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGL 
QKLKLENDLS TPAS QC VL INTS VTLTPT P VKSTAD I TV I QP VS E 
M I N IQ FNDKVNKPF V CQNQG CNYS AMTKDAL FKHYGK I HQ YT P E 
M I LE I KKNQLKFAP F KCWPTCTKTFTRNSNLRAHCQLVHKFTT 
EEM\/KLKIKRPYGRKSQSEN7PASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVI PEKQLIEKKS PDFCTESSLQ VITVTS 
EQCNTNALTNTQTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
<5PYPPYP rVHnnrPAAFTTOONT^T.HYOAVHTC^DLPAFSAEVFF 
ESEAGKESEETETKQTLKEFRCQVSDCSRIFQAITGLIQHYMKL 
HEMTPEE IESMTASVDVGKFPCDQLECKSS FTTYLNYWHLEAD 
HGIGLRASKTEEDGVYKCDCEGCDRIYATRSNLLRHIFNKHNDK 
HKAHLI RPRRLTPGQENMSS KANQEKS KS KHRGT KHSRCGKEG I 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALS ECTS RFVTQYP CMI KGCTS WTSESN 1 1 RH Y KCHKLS KAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTAT VSQKE VE KNE * DEMDEIiTELFITKL INEDSTS VETQA 
NTSSWSNDFQEDNLCQSERQKASNLKRVNKEKNVSQNKKRKVE 
KAE PAS AAELSS VRKE EETAVAI QTI EEHP AS FDWS S FKPMG FE 
VSFLKFLEESAVKQKXNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VLKQLQEMKPTVSLKKLBVHSNDPDMSVMKDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
P PS WRRQPPGGIRRDFS RRLRREANLVATCLPVRAS LPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QS KPGCYDNGKHYQ INQQWERTYLGNALVCTCYGGS RGFNCES K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCIiGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNbLQCICTGNGRG 
EWKCSRHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS GWYS VGMQLA* KTQGNKQML \ CTCLGNGVS CQETAVTQT YG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVL VQTRGGNSNGALCHFPFL YNNHNYTDCTS EGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEE ICTTNEGVMYR I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPI QWNAP QPSHISKYI LRWR P KNS VGRWKE AT I P 
GHLNS YTIKGLKPGWYEGQLI S I QQ YGHQEVTRFDFTTTSTST 
P VTSNT\ VTGETTP F S PLVATSE S VTEITASS FWS WVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
I T I YAVEENQESTP W I QQETTGTP RSDT VPS PRDLQF VE VTDV 
KVT I MWTP PES AVTG YRVDVI PVNLPGEHGQRLPLSRNTF\ AEN 
TGLS PGVT YY FKVFAVSHGRE S KPLTAQQTT KL\DAPTNLQFVN 
ETDS T VLVRWTPPRAQ I TG YRLTVGLTRRGQ PRQYNVGPS VS KY 
PLRNLQPAS E YTVSLVA I KGNQE S P KATGVFTTLiQPGSS I PP YN 
T E VTETTI VI TWTPAPRI GFKLG VRP SQGGEAPREVTSDSGS I V 
VSGLTPGVE YVYTIQVLRDGQERDAP \ IVNK\WTPLSPPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 
HADQSSCTF\DNLEVPGLiEYNVSVYTVKDDKESVPISDTIIPAV 
P P PTDLR FTN/ 1 LGPDTMRVTW \ AP P PS IDLTNFLVRYS P VKNE 
GRMLQSLS I FFLSDN\AWLTNLLPGTEYWS VSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALiNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLl\SWD 
APAVTVRYYRIT YGETGGNSPVQE FTVPGS KS TATISGLKPGVD 
YT ITVYAVTGRGDS PAS SKPISI NYRTE I DKPS QMQVTDVQDNS 
I S VKWLPS SS P VTGYRVTTT\ P KNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYVVSVYAQNPSGESQPLVQTAVTN'IDRPKGLAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLR PG S E YTVS WALHDDME S Q? L IGTQ S TAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
D VRS YTITGLQ PGTDYKI YLYTIiNDNARSS P WI DASTAIDAPS 
NLRFLATT PNS LLVS WQP PRAR I TG Y I IKYEKPGSP PREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQP S VGQQM I FEEHGFRRTTPPTTATP IRHRPRP YP PNVGQE 
ALSQTT I S WAP FQDTSEYI ISCHP VGTDEE PLQ FRVPGTSTSAT 
LTGLTRGATYN 1 1 VEALKDQQRHKVREEWTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS P EGTTGQS YNQYSQRYHQRTNTNVNCP I ECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
P PSWRRQP PGGI RRDFSRRLRREANIiVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCyDKGKHYQlNOQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEKT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I TCTSRNR CNDQDTRT S YR I GDTWS KKDNRGNLLQC I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DS G WYS VGMQLA* KTQGNKQM L \ CT CLGNG VS CQET AVTQT YG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTOHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGOGRGRWKCDPVDQCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTI KGLKPGWYEGQLIS IQQYGHQEVTRFDFTTTSTST 
P VTSNT\VTGETTP FS PLVATSES VTE I TAS S FWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLP£TATSV\NIP\DLLPGRKYIVN 
VYQ I S EDGEQS L I LSTSQTTAPDAP PDPTVDQVDDTS I WRWSR 
PQAP I TGYR1 VYSPSVEGSSTELNLPETAKS VTLSDLQPGVQYN 
IT I YAVEENQESTPWI QQETTGTPRSDTVPSPRDLQF\ r EVTDV 
KVTIMWTP PES AVTG YRVDVI PVNLPGEHGQRLPLS RNTF \ AEN 
TGLSPGVTYYFXVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETD S TVLVRWT P P RAQ I TG YRLTVGLTRRGQ PRQ YNVG P S VS KY 
PLRNLQPAS E YTVSLVA I KGNQES P KATGVFTTLQPGS S I P P YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\ IVNK\ WTPLS PPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 
HADQS SCTF\ DNLEVPGLE YNVS VYT VKDDKBS VP I S DT 1 1 PAV 
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 
GRMLQSLS I FFLSDN\AWLTNLLPGTE YWS VSS VYEQHES TP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLl\SWD 
APAVTVRYYR 1 TYGETGGNS P VQE FTVPGSKSTATISGLKPG VD 
YT I TVYAVTGRGDS PAS S KP IS INYRTE I DKPSQMQ VTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVE YWS VYAQN PSGESQPLVQTAVTN I DRP KGLAFTDV 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSE YTVS WALHDDMESQPLI GTQSTAI PAPTDLKFT 
QVT PTS LS AQ WTP PNVQLTGYRVRVTPKE KTGPMKE I NLAPDS S 
SVWSGLMVATKYEVSVYALKDTLTSRPAG^VVTTLENVSPPRR 
ARVTDATETT IT I S WRTKTET I TGFQVDAVPANGQT P I QRTI KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFIiATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTI YVIALKNNQKSEPLIGRKKTDELP 
QLVTLP H PNLHG PE I LD VP ST VQ KT P FVTHPGYDTGNG IQLPGT 
SGQQPSVGQOM I FEEHGFRRTTP PTTATP IRHRPRPYPPNVGQE 
ALSQTT I S WAP FQDTSEY 1 1 SCHP VGTDE E PLQFRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQRHKVREEWTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCIiGFGSGHFRCD 
SSRW CHDNGVNY KIGE KWDRQGENGQMMS CTCLGNGKGEFKCD P 
HEATCYDDGKTYHVGEQWQKEYLiGAICSCTCFGGQRGWRCDNCR 
R PGGE P S P EGTTGQS YNQ YS QR YHQRTNTNVNCP I E C FMPLD VQ 
ADREDSRE 


5365




703 


RLCCTGGGEGT PGASGKRGPAATTSLVLC I PS VP P P VPFPTLW P 
PPSWRRQPPOG I RRDFSRRLRREANLVATCLPVRASL PHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQOMVQPQSPVAVS 
Q S KPG CY DNGKHYQ I NQQWERT YLGNALVCTC YGGS RG FNCE S K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKrGDTWRRPHETGGYMLECVCLGNGFCGEWT 
CKP IAEKCFDHAAGTS YWGETWE KP YQGWMMVDCTCLGEGSGR 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I TCTSRNRCNDQDTRTS YR IGDTWS KKDNRGNLLQCI CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML \ CTCLGNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYKGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

dnmkwcgttqnydadqkfgfcpmaahee i cttnegvmyr igdqw 
dkqhdmghmmrctcvgngrgewtc i aysqlrdqci vddi tynvn 
dt fhkrheeghmlnctcfgqgrgrwkcdp vdqcqdsetgtf yq i 
gdswekyvhgvryqcycygrgigewhcqplqtypsssgpvevfi 
tetpsqpnshpiqwnapqpshiskyilrwrpknsvgrwkeatip 
ghlnsytikglkpgwyegqlis1qqyghqevtrfdftttstst 
pvtsnt\vtgettpfsplvatsesvteitassfvvswvsasdtv 
sgfrveyelseegdepqylvlpstatsv\nip\dllpgrkyivn 
vyq is edgeqs l i lstsqttapdap pdptvdqvddts i wrwsr 
pqap i tg yr i vys ps ve g ss te lnl p etans vtls dlqpg vqyn 
i ti yaveenqestp wiqqettgtprsdtvps prdlqfve vtdv 
kvt i m wtp p e s avtg yr vdvi p vnl pgehgqrl p lsrntf \ aen 
tglspgvtyyfkvfavshgreskpltaqqttkl\daptnlqfvn 
etdstvlvrwtppraq i tgyrltvgltrrgqprqynvgps vs ky 
plrnlq pas eytvslva i kgnqes pkatgvfttlqpgss i p p yn 
tevtettivitwtpaprigfklgvrpsqggeaprevtsdsgsiv 
vsgltpgveyvytiqvlrdgqerdap\ivnk\wtplspptnlh 
leanpdtgviitvswersttpditgyrltttptngqqgnsleew 
hadqs sct f \ dnlevpgle ynvs vytvkddkes vpisdti i pav 
p p p tdlrftn/ 1 lg pdtmr vtw\ ap p p s idltn flvrys p vkne 
grmlqs lsi ffls dn\awltnll pgte yws vs s v yeqhe s tp 
\lrgrqktgldsp\tgidfs\dita\nsft\vhw\iapra/tpi 
tgyrir\hhpehf\sgrpredr\vphsrnsitltnltpgteyw 
s i valngrees pll i gqqstvsdvprdlewaatptslli \ s wd 
apavtvryyritygetggnspvqeftvpgskstatisglkpgvd 
yt i t vya vtgrgds pas s kp i s in yrte i dk ps qmq vtdvqdns 
isvkwlpsss pvtgyrvttt \ pkngpg\ ptktktagpdqtemti 
eglqptveywsvyaqnpsgesqplvqtavtnidrpkglaftdv 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSaLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATE TT I T I S WRTKTE T I TG FQ VDAVPANGQTP I QRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLR FLATTPNS LLVS WQP PRAR I TG Y I IKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGAT YN 1 1 VEALKDQQRHKVREEWTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
S S RW CHDNGVN Y K I GE KWDRQGENGQMMSCTCLGNGKG E FKC DP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5366 


8066 


703 


RLCCTGGG EGT PG ASG KRGPAATT S LVLCI PS VPPPVP FPTLWP 
P P S WRR Q PPGG I RRDFS RRLRREANLVATCL PVRAS LP HRLNM L 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CT I ANR CHEGG QS Y KI GDTWRR PHETGG YMLE CVCLGNGKGE WT 
CKP I AEKCFDHAAG TS YWGETWE KP YOG WMMVDCTCLG EGSGR 
I TCTSRNRCNDQDTRTS YRIGDTWSKKIDNRGNLLQCI CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQML\ CTCLGNGVS CQETAVTQTYG 
GNSNGE PCVLP FTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 








KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR

DNMKWCGTTQN YD ADQ KFG F C PMAAHE E I CTTNEG VM YR IGDQ W 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQliRDQCIVDDrTYNVN 

DT FHKRH E EGHM LNCT C FGQGRGRW KCDPVDQCQDS E TGTFYQ I 

GDS WE KYVHG VR YQCY C YGRG I G E WHCQ PLQT YPS S S GP VEVF I 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNS YT I KGLKPG WYEGQL I S I QQ YGHQE VTRFD FTTT STS T 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

VYQISEDGEQSLI LS TS QTTAP DAP P DP T VDQVDDTS I WRWS R 

PQAPITGYRIVYSPSVEGSSTELNIiPETANSVTLSDLQPGVQYN 

I T I YAVEENQES TP W I QQ ETTGT PRS DT VPS PRDLQ FVE VTD V 

KVT I MWTP P ES AVTG YR VD VI P VNLPGEHGQRL PLSRNTF \ AEN 

TGLS PGVTYYFKVFAVSHGRES KPLTAQQTTKL \DAP TNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASE YTVSLVAI KGNQES PKATGVFTTLQPGSS I PPYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLTPG VE YVYT I Q VLRDGQERDAP \ I VNK\ WTPLS P PTNLH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 

HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTI I PAV 

PPPTDLRFTK/ ILGPDTMRVTW\APPPS IDLTNFLVRYSP VKNE 

GRMLQSLS I FFLSDN\AWLTNLLPGTEYWSVS SVYEQHESTP 

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 

SIVALNGREES PLLIGQQSTVSDVPRDLEWAATPTSLLI \SWD 

APAVTVRYYRI TYGETGGNS P VQEFTVPGS KSTATISGLKPG VD 

YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS 

ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 

EGLQPTVEYVVSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 

S WVS GLMVATK YEVS VYAL KDTLTS R PAQG WTTLENVS P PRR 

ARVTDATETT I T I SWRTKTETI TGFQVDAVPANGQTPIQRT I KP 

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPIilGRKKTDELP 

QL VTL PHPNLHG P E I LDVP STVQKTP FVTHPG YDTGNG I QL PGT 

SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 

ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGATYNI I VEAL KDQQRHKVREE VVTVGNS VNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKE YLGAI CS CTCFGGQRGWRCDNCR 

R PGGE PS PE GTTGQS YNQ Y S QR YHQRTNTNVNCP I EC FMPLD VQ 

ADREDSRE 


5367 


235 


3591 


KKILNMLCKKNIVIEYLADILYEYLYGFCFSGIKKYLIIHVLRL 
ILELWMTRLLLEKSVSLQTQYLLLIVKILSWFPGKEMRHHLQIM 
EVMMRKQDS/RIVGNGSEQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLQKPEMSLPVKPGQ 
GDSEAS S P FTPVADEDS WFS KLTYLG CAS VNAPRSE VEALRMM 
S I LRSQCQI SLDVTLSVPNVS EGI VRLLDPQTNTE I ANYPI YKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYS FATAFRRS AKQTP LS ATAAPQTPDSD I FTFSVS LE I KEDDG 
KG YFSAVPKDKDRQCFKLRQG IDKKIVI YVQQTTNKELAI ERCF 
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWNPKSPH 
FQWNEETPKDKVLFMTTAVDLVI TE VQEP VRFLLETKVRVCS P 
NERLFWPFS KRSTTENFFLKLKQI KQRER KNNTDTL YE WCLE S 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLL 
SGSGD VS KECAEKI LETWGELLS KWHLNLNVR PKQLS SLVRNG V 
PEALRGEVWQLLAGCHNNDHL VEKYR I L I TKES PQDS A ITRD IN 








RTF PAHDY F KD TGGDGOD S LY KI C KAYS VYDEE IGYCQGQ S FLA 
AVLLLHMPEEQAFSVLVKIMPDYGLRELPKQNFEDLHCKPYQLE 
RLMQEYIPDLYTOFLDISLEAHMYASQWFLTLFTAKFPLYMVFH 
I IDLLLCBGISVIFNVALGLLKTSKDDLLLTDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMKI SQKKLKKYEKEYHTMREQQAQQ 
EDPIERFERENRRLQEANMRLEQENDDLAHELVTSKIALRKDLD 
NAEEKADALNKELIi^KQKLIDAEEEKRRLEEESAHLraCMCRRE 
LDKAESE I KKNS S I IGD YKQICSQLS ERLEKQQTANKVE I EKI R 
QKVDDCERCREFFNKEGRVKG I S S TKEVLDEDTDEE KETLKNOL 
REMELELAQTKL\QLVEAECK10D\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLSS I KTATG VQGKETC 


5368 


573 


2014 


GAAAGAADPRRGSLGGRTMLDFAI FAVT FLLALVGAVLYLYPAS 
RQAAGI PGI TPTEEKDGNLPD I VNSGSLHEFLVNLHERYGPWS 
F W FGRRL WS LGTVD VLKQHINPNKTLD / D F * NHAEVI I KVS I W 
WWQCE * KP \ QRKKLYENG VTDS LKSNFALLLKLPEELLDKWIiS Y 
PETQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSEIGKGFLDGSLDKNMTRKKQYEDALMQLESVLRNIIKERK 
GRNFSQHI F I DS LVQGNLNDQQ I LEDSM 1 FS LAS CI I TAKLCTW 
AIWFLTTSEEVQKKLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTPVSAQLQDIEGKIDRFIIPRETLVLYALGWLQDP 
NTW P S P HKFD PD RFDDEL VMKT FS S LG FSGTQE CPELR FAYMVT 
TVLliSVLVKRLHLLSVEGQVIETKYELVTS SREEAWITVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTMSASFVPNGASLED 
CHCNLFCIiADLTG I KWKKYVWQG PTSAP ILFP VTEED P I LS S FS 
RCLKAD VLG / VWRRDQRP ERRE \ L> * I FWGGEDP \ VLLTLFTMTY 
QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQAFKMSDSATKK 
L IGE WKQFYP I S CCLKEMSEEKQEDMDWEDDS LAAVEVLVAG VR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGKI PRKLANHVVDRVWQECNMNRAQNKRKY&ASSGGLCEEATA 
AKVAS WD FVE ATQRTNCS CLRHKNL KS RNAGQQGQAPS LGQQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRLV\ISAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTEMANSPQ 
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 
EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQYQIKNQCLSAIASDAEQEPKIDPYAFVEGDEEF 
LFPDKKDRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 
IKQDAPRPTSHARPPSTSLIYDSDLAVSYTDLDNLFNSDEDELT 
PGSKRSANGSDDKASCKB SKTGNLDPLS CISTADLHKMYPTPPS 
LEQH I MG FS PMNMNNKE YGSMDTTPGGTVLEGNS SS IGAQFKI E 
VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 
LPLIKLPEECIYRQS WTVGKLELLS SG P S MP P I KEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLNSVEP 
ATVPSIPEAHSLYVNLILSESVMNLFKDCNSDSCCICVCNMNIK 
GADVGVY I PDPTQEAQYRCTCGFSAVMNRKFGNNSGLFFEDELD 
I IGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDLILL 
LQDQCTNLFS PFGAADQDPFPKSGVI SNWVRVEERDCCNDCYLA 
LE HGRQ FMDNMS GG KVDEALVKS S CLH P W S KRND VS MQCS QD I L 
RMLLSLQPVIiQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TDESPEPLPI PT FLLG YD YD YL VLS P FAL P YWER LMLEPYGS QR 
D I AYWLCPENEALLNGAKS FFRDLTAI YESCRLGQHRPVSRLL 
TDG I MR VGS TAS KKLS EKLVAEW FS GAADGNNEAFS KLKL YAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGMA 
NT PS ATLAS AASSTMTVTSGVAI STS VATANSTLTTASTS SS S S 
SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
CTS ALQTAG I SGESS S LPTQPHPDVSES TMDRDKVG I PTDGDSH 
AVTYPPAIVVYIIDPFTYENTDESTNSSSVWTLGLLRCFLEMVQ 








TLP PH I KS TVS VQ IIP CQ YLLQP VKHEDREI YPQHLKS LAFSAF 
TQCRRPLPTSTNVKTLTGFGPGLAMETALRS PDRPECI RLYAPP 
FIIAPVKDKQTELGETFGEAGQKTNVLFVGYCIiSHDQRWILASC 
TDbYGELLETCIINIDVPNRARRKKSSARKFGLQKLWEWCLGLV 
QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 
CRMCGISAADSPSILSACLVAMEPQGSFVIMPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPDIINILPASPTGSPVHSP 
GSHYPHGGDAGKGQSTDRIiLSTEPHEEVPNIIiQQPLALGYFVST 
AKAG P LPDW FW SAC PQAQ YQC P L F LKAS LHLHVPS VQSDE LLH S 
KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQLYNFIMNML 


5370 


1226 


716 


RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVErVQQLLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGLGNTPLHIiAACTNHVPVITTUiRGGARVDALDR 
AGRTP LHLAKS KLNI LQEGHAQCLKAVR /HGGEADHP YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGS APSS S S CCTVSTSLALAES LSLFRACTS LPVG 
GCISWL 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCSFRKMRSPPKYRPFLACFTYTTDKQS 
S KENTRTVE KLYKCS VD IRKI RR\ * KDGYF* RMKPMLKKLRI / F 
LQELGADETAVASILERCPEAIVCSPTAVNTQRKLWQLVCKNEE 
ELI KLI EQ FPES FFT I KDQENQKLNVQFFQELGLKNVVI SRLLT 
AAPNVFHNPVEKNKCMVRILQESYUDVGGSE^^KVWLLKLLSQ 
NPFILLNS PTAIKETLEFLQEQGFTSFEILQLLSKLKGFLFQLC 
P R S I QNS I S FS KNAFKCTDHDL KQLVLKCPALL YYS VP VLE ERM 
QGLLREGI S IAQI RETPMVLELTPQI VQYRIRKLNSSGYR I KDG 
KLANLNGSKKEFEANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


G57 


SPGAQFLWAAPDKPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLI LLF VTELSGAHNTTVFQGVAGQSIiQVS CPYDSMKHWGR 
RKAWCRQLGEKGPCQRWSTHNLWLIiSFLRRWNGSTAITDDTLG 
GTLTITI*RNLQPHDAGLYQCQSLHGSE1ADTLRKVLVEVIjADPI^ 
HRDAGDLWFPG\DLRASRMPMWSTAS PGAS WKEKS PSHPLPS FS 
SWPASFSSRF*QPAPSGLQPGMDRSQGHIHPVNWTVAMTQGISS 
KLCQG 


5373 


2814 


346 


VKKTKS I FNSAMQEMEVYVENIRRKFGVFNYSP FRTPYTPNS QY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRI SLSDMPRS PMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGOAGSLSGS PKPFS PQLSAP ITTKTD 
KTS TTGS I LNLNLDRS KAEMDLKE LSES VQQQS TP VP L I S PKRQ 
I RS RFQLNLDKT I ESCKAQLG INE I SEDVYTAVEHSDS EDS EKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVE I KEE LKSTSPASE KADPGAVKDKAS PE PEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GR KNKKE P KE PS P KQD WG KTP P S TTVGSHS P PET P VLTRS S AQ 
TSAAGATATTSTS STVTVTAPAPAATGSP VKKQR PLLPKE \TAP 
AVQRSCGTS STVQQKE I TQS PSTST ITLVTSTQSS PLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLS KN\TTW KAQLAEDSQGLRI E I EKLQWLHQQEL\ SEMKHN 
LELTMAEMRQSWEQEJIDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAI FYCCWNTS YCDYPCQ\ QAHWPEH\MXS CTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DE KRGS \ TRS DHN / TPSTQHGRS LL PGKESRAGTP FLGTS K 


5374


2814 


346 


VKKTKS I FNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGS PKPFS PQLSAP ITTKTD 
KTSTTGS I LNLNLDRS KAEMDLKELS ES VQQQSTP VPL I S PKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 








TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TS AAG ATATTS TS STVTVTAPAP AATGS P VKKQRPLLP KE \TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTOSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEAD AE \ VNTE T LNKS S QG S S S S TQ S AP SETAS A\ S KE KETS A 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIFLAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTLGKES
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA/RGLPLPCPECGRRFRHAPFLALHRQVHAAATPDWGFACH 
LCGQS FRGWVALVLHLRAHSAAKAGPFACPKMARDAFWRRKAAS 
SSILRRCHPSRPRGPRPFICGNCGRSILPTWDQ/LKVAHKRVHV 
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKR FRH K \ PNL I RS HAACTSG ER PHQ / CS RE CG \ KRFTNKP Y 

lts\hrrithtarqpypckecgrrfrhkpnllshskihkrsegs 
aqaapgpgspqlpagpqesaaeptpavplkpaqepppgappehp 
qdpi eappslys cddcgrs frlerflrahqrqhtgerp ftcaec 
gknfgkkthlvahsrvhsgerpfrlarkcgrrflprasqsggrn 
saepnaprfgpfvcpdcgkafrhkpylaahrpiatpaekpyvcp 
dcrkafsqksnlWshrrihtgerpyacpdcdrsfsqksnlith 

RKSH I RDGAFCCAI CGQTFDDEERLLAHQKKHDV 


5376 


4504 


591 


vstfslclwpaggggrgrvsnmaqskrhvysrtpsgsrmsaeas 
arplrvgsrvevigkghrgtvayvgatlfatgkwvgvildeakg 
kndgtvqgrkyftcdeghgi fvrqsq I qvfedgadtts PETPDS 
saskvlkregtdttaktsklrglkpkkaptarktttrrpkptrp 
astgvagassslgpsgsasagelsssepstpaqtplaapiiptp 

VLTS PGAVP PLPS P SKEEEGLRAQ VRDLEE KLETLRLKRAEDKA 
KLKELEKHK IQLEQVQEWKS KMQBQQADLQRRLKEARKKAKEAL 
EAKERYMEEMADTADAI EMATLDKEMAEERAES LQQEVEALKER 
VDELTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQELEWRQQRERLQEELSQ 
AESTIDELKEQVDAAI^AEEMVEMLTDRl^XEEKVREIJ^ETVG 
DLEAMNEMNDELQENARETELELREQLDMAGARVREAQKRVEAA 
QETVADYQOT I KKYRQLTAHLQDVNRELTNQQEAS VERQQQPPP 
ETFDFKI KFAETKAHAKA I EMELRQMEVAQANRHMS LLTAFMPD 
SFIiRPGGDHDCVLVTiLLMPRLI CKAELI RXQAQEKFELSENCS E 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCR 
LD\VY KKVGS LY PEMSAHERSLDFL IELLHKDQLDETVNVE PLT 
KA I KYYQHL YS I HLAEQPE DCTMQLADH I KFTQS ALDOIS VE VG 
RLRAFLQGGQEATD I ALLLRDLETSCS \DIRQFCKKIRRRMPGT 
DAPGIPAAIAFGPQVSDTLLDOIKHLTWVVAVLQEVAAAAAQLI 
APLAENEGLLVAALEELAFKASEQIYGTPSSSPYECLRQSCNIL 
ISTMNKNLNH'AMQEGEYDAERPPSKPPPN^IjRAAALRAEITDA 
EGLG L KLE DRE T V I KE LKKS LKI KGE ELS EANVRLTLLE KKLDS 

aakdaderiekvqtrleetqallrkkekefeetmdalqadidql 
eaekae lkqrlnsqs krt i eg lrgpppsg iatlvs g i ageeqqr 
gaipgqapgsvpgpglvkdsplllqqisamrlhisqlqhensil 
kgaqmkaslaslpplhvaklshegpgselpagalyrktsqllet 
lnqlsththwditrtspaakspsaqlmeqvaqlkslsdtvekl 
kdevlketvsqrpgatvptdfatfpssaflrakeeqqddtvymg 
kvtfs caagfgqrhrlvltqeqlhqlhsrli s 


5377 


762 


1106 


dvpckrvlpaeaqekgqltlscgesgeeg\f*yhevrqaeges* 
/wfgpnvrlvhtqlktkkpsgtlkakfyuttgstkfaarisctk 
s s * wpg ydgwwggqyi F I frgmrweeq P 


5378


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR








S FS FRNS KQTYSGVP 1 1 AANMDTVG TF EMAKVLC KS * VPGS FWD 
VPQMGCVFLI YKLFTLKWKMLLLSVLLPAS ILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
Q VKY ICLDVANG YSEH FVEFVKDVRKR FPQHTI MAGNWTGEM V 
EEL I LSGAD 1 1 KVG 1 G PGS VCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHI I SDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG 
EL I ERDGKKYKLF YGMS S * I \ AM \ KKYAGGVAE YRAS EGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR 
S FS FRNSKQTYSGVP 1 1 AANMDTVGTFEMAKVLCKS * VPGSFWD 
VPQMGCVFLI YKLFTLKWKMLLLSVLLPAS ILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
Q VKY I CLD VANGYS EHFVEF VKD VRKRFPQHTI MAGNWTGEMV 
EEL I LSGAD 1 1 KVG IGPGS VCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHI I SDGGCS CPGDVAKAFGAG AD FVMLGGMLAGHSESGG 
EL I ERDGKKYKLFYGMS S * I \AM\ KKYAGGVAE YRAS EGKTVEV 
PFKGDVEHT I RDI LGGI RSTCT YVGAAKLKELSRRTTFI RVTQQ 
VNPIFSEAC 


5380 


2 


2050 


P S RAGG AERG RAAAARS PGGS AAGWE CPS VLDEAG ACTMS SCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
S F I WTECEPGCAVDLGLARDRP LEADGQE VPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVES HHVS I TGMQDCVQLNQYTLKDE I GKGS YG WKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPl\EQVYQEIA\lLKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII 
H\ RD I KPSNLL VGEDGHI KI AD FGVSNEFKGSDALLSNTVGTPA 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI 
KLH P WVTRHGAE PL P S EDENCTLVEVTE E E VENS VKH IPS LATV 
I L VXTM IRKR S FGNP FEGS RREERS LS APGNLLTKKPTRE CES L 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


psraggaergraaaars pggsaagwecps vldeagactms s cvs 
sqp s s nraapqd e lggrgs s s s e s qkpcealrgls s ls i hlgme 
sfiwtecepgcavdlglardrpleadgqevpldtsgsqarphl 
sgrklslqersqgglaaggsldmngrcicpslpyspvsspqssp 
rlprrptveshhvs i tgmqdcvqlnqytlkde igkgsygvvkla 
ynendntyyamkvlskkklirqaafprrppprgtrpapggciqp 
rgp i \eqvyqe ia\ ilkkldhpnvv\klvevl\ddpnedhlymv 
f\elvnqgpvmevptlkplsedqarfyfqdlikgieylhyqkii 

H\RD I KPSNLL VGEDGH I KIADFGVSNEFKGSDALLSNTVGTPA 

fmapeslsetrkifsgkaldvwamgvtlycfvfg*cpfmderim 
clhskiksqalefpdqpdiaedlkdlitrmldknpesriwpei 
klhp wvtrhgaep lp sedenctlvevteeevens vkh i ps latv 
ilvktm i rkrs fgnp fegsrreers ls apgnlltkkptre cesl 
selkt*kisplpacckvt*efphpsgcrpscwqppflhthsqpr 

*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNIALLRYNSHTM


5382 


1536 


203 


GARGSQQDAPALQEABVRGPERAQPARGRMTKARLFRLWLVLGS 
VFM I LL I IVYWDS AGAAH FYLHTS FSRPHTGP PLPTPG PDRDRE 
LTADS D VDE FLDKFLS AGVKQSDLPRKETEQ P PAPGSMEES VRG 
YDWS P RDARRS PDQGRQQAERRS VLRGFCANS SLAFPTKERP FD 
DI PNS ELSHLI VDDRHGAI YCYVP KVACTNWKRVMI VLS GSLLH 
RGAPYRDPLRI PREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL 
KKYTKFLFVRDPFVRLI SAFRSKFELENEEF/ * PQVRRAHAAAV 
RQ PHQP ARLGARGLPRW PQ \VS FANF I QYLLDPHTEKLAPFNEH 

WRQ VYRLCH PCQ I D YD FVGKLETLDEDAAQLLQLLQ VDLAAPL P 








PELPGTGPPSSWEEDWFAKIPLAWRQQLYKLYEADFVLFGYPKP 
ENLLRD 


5383 


45 


5250 


VERLL^CRNSKRTWRMLISKNMPWRRLQGISFGMYSAEELKKLS 
VKS ITNPRYLDSLGNPSANGLYDLAIjGPADSKEVCSTCVQDFSN 
CSGHLGH I EL P LTVYNPL L FD KL YLLIiRG S CLNCHMLTC PRAV I 
HLLLCQLRVLEVGALQAVYELERI LSRFLEENADPSASE IREEL 
EQ YTT E I VQNNLLGSQGAHVKNVCES KS KL I AL FW KAHMNAKRC 
PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 
IGKJiGYLTPTSAREHLSAIiWKNEGFFLNYLFSGMDDDGMESRFN 
PS VFFLDFLWP PSRS RPVSRLGDOMFTNGQTVNLQAVMKDVVL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDR5FLSTLPGQ 
SLIDKLYNIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEKKE 
GLFRKHMMGKRVDYAARS VICPDMYINTNE IG I PM VFATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 
QRE AVAKQLLTP ATGAP KP QGT K I VCRHVKNGD I LLLNRQ P TLH 
RPS IQAHRARI LPEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 
ELGRAEAYVLACTDQQ YLVPKDGQPLAGL I QDHMVSGAS MTTRG 
CFFTREHYMELVYRGLTDKVGRVKLLSPS ILKP FPLWTGKQWS 
TLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
ESQVIIREGELLCGVLDKAHYGSSAYGLVHCCYEIYGGETSGKV 
LTCLARLFTA YLQL YRG FTLGVED I L VKP KADVKRQRI I E ESTH 
CG PQ AVRAALNL P EAAS YDE VRG KWQDAH LGKDQRD FNM I DLKF 
KEEVNH YSNE I NKACMP FGLHRQFPENTLQLMVQSGAKGSTVNT 
MQ I S CLLGQ I E LEG RS T P LMASGKSLP CFE PYE FTPRAGG F VTG 
RFLTGI KPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCI I KHLE 
GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 
SNYEVIMKSQHLHEVLSRABPECKALHHFRAIKKWQSKHPNTLLR 
RGAFLSYSQKIQEAVKALKLESENRNGR/RPWDS/G/RMLRMWY 
ELDEE SRRKYQKKAAACPDPSLS VMR PDI YFAS VS ETFETKVDD 
YSQBWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 
LLAAQSIGEPSTQMTLNTFH?AGRGEPWVTLGIPRIjREILMVAS 
ni\±j\ j. ymms VP v JjN i ivivAJjKRVKSLKKQLTRVCI^EVIjQKIDVQ 
ESFCMEEKQNKFQVYQLRFQFLPHAYYQQEKCLRPEDILRFMEr 
RFFKLLMESIKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 
PSRPPDAAPETHPQPGAPGA\EAMERRVQAVREIHPFIDDYQYD 
i. acounLyv i v ir jjiyiis. ± pt r JJrloo JL» V V a J-iAHtaAV 1 Y ATKG I TRC 
LUTOTTNKKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 
A I ANT YG I E AALR VI EKE I KDVFAVYG I AVDPRHL S LVADYMCF 
EGVYKPLNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSHDELR 
SPSACLWGKWRGGTGLFELKQPLR 


5384 


196 


886 


QS03QRLPTVL*L*GPPGSCPCILSLF\PGRFHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCOKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRG I YFFSLNVHS WNYKETYVHI MHWOKEAVT r. YflOPQ 

ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 
SGHLIKAEDD 


5385 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
S PTFRR PKTL * LRRQP KYPWKS TPRRNKLDHHVT I KFPLTTE * A 
VXKIE1WSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ 
SDGERKAYVRLAPDYDALWATKIGIT


5386 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM
S PT FRR P KTL * LRRQP KYPWKS TPRRNKLDHHVI I KFPLTTE * A 
VKKIE^SLLVFTVDVKANKHQIKQAVKK/liCDIDVAKVNTLIQ 
SDGERKAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
S DD L VFPG FFELWR VLWW IG I LTL YLMHRGKLDCAGGALLSS Y 
L I VLMILLAWICTVSAI MCVSMRGTI CNPGPRKSMSKLL YIRL 
ALFFPEMVWASLGAAWADGVQCDRTWNG 1 1 ATWVSWI I IAA 
TWSIIIVFDPLGGKMAPYSSAGPSHLDSHDSSQLLNGLKrAAT 








SVWETRIKLLCCCIGKDDHTRVAFSSTAELFSTYFSDTDI/VPSD 
I AAG LALLHQQQDN I RNNQEP AQ WCHAPG S S QEADLD AE LKNC 
HHYMQFAAAAYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M 
VGGDQLQL/CTSAPILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHR KES VWAVRGTMS LQD VLTDLS AES E VLD VECE VQDRLAH 
KGISQAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGGGAA 
ALLATMVRAAYPQVRCYAFSPPRGLWSKALQEYSQSFIVSLVLG 
KDVIPRLSVTNLEDLKRRILRWAHCNKPKYKILLHGLWYELFG 
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYSFSSDSPL 
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 
FSKILIGPKMLTDHMPDILMRALDSWSDRAACVSCPAQGVSSV 
DVA 


5388 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYP FTGG YRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQ Q P S EEQS KS LE \NRNKKRIAVS CAGRKWD LLGLNAGVEMF 
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQ Q P S EEQS KS LE \NRN KKRIAVS CAGRKW D LLG LN AG VE M F 
TVVYTVTCfflYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
XLEHLQTKN 


5390 


217 


1332 


EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS I TRDDN I AAF KR I RLR P R YLRD VS EVDTRTT I QGE E I 
SAPICIAPTGFHCLVWPDGEMSTARAAQAA\GICYITSTFASCS 
LED I VIAAPEGLR W FQL YVHPDLQLNKQL I QRVESLGFKALV I T 
LDTP VCGNRRHD IRNQLRRNLTLTDLQS PKKGNAI PYFQMTP I S 
TSLCWDLSWFQSITRLPIILKGILTKEDAELAVKHNVQGIIVS 
NHGGRQLDEVLAS I DALTEWAAVKGK I EVYLDGGVRTGNDVLK 
ALALGAKCI FLGDAI LWALASKGEHGVKE VLN I LTNEFHTSMA\ 
L TGCRS VAEINRHLVQFSRL 


5391 


1 


1292 


VKKAAGR SRG PPTAGGQR CE EAPGTVM E RRLG VRAWVKENRGS F 
QPPVCNKLMHQEQLKVMFVGGPNTRKDYHIEEGEEVFYQLEGDM 
VLRVLEOGKHRDWI RQGE I FLLPARVPHSPQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAPI IQEFFS 
S EQYRTGKP I PDQLL KE PP F PLS TRS I ME PMS LDA WLDSHHR E L 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDVWLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS P WGEP S CHGLKAATG VPSTLEVP S LPNNS PS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
P VLPGGLP PAPLLP I PLSLQ TQCSTS TPRR PS I KAS 


5392 


1 


1623 


I RGSNAQKWGAS GS GGAG PQP D PAG PGG V P ALAAAVLGACE P R 
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 
FIHKPAHGWLHPDARVLGPGVSYVVRYMGCIEVLRSMRSLDFNT 
R TQVTRE AINRLHE AVPG VRGS WKKKAPNKALAS VLGKSNLR FA 
GMS I S I H I STDGLS L S VPATRQVI ANHHMPS I S FASGGDTDMTD 
YVAYVAKDP INQRACH I LECCEGL\AQS I I STVGQAFELRFKQY 
LHS P PKVALPPERLAG PEES AWGDEEDSLEHNYYNS IPGKEP PL 
GGLVDSRLALTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 
PGDGYVQADARGP PDHEEHLYVNTQGLDAPE PEDSPKKDLFDMR 
P FEDALKLHECSVAAGVTAAPLPLEDQ WPS P P TRRAP VAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 








\NLI PTHTQPS \YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHXjDPTFSIPOANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGI I VAWI ATAVAAI VAAWALI YCRKKRI SAN 
STD P VKAAQFE P PGRQM I AI RKRQ LEETNND YETADGG YMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5394


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPW INVLQ\EDS VTLTCQGAPQP/ERSDS IQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
Q VPSMGSSSPMG 1 1 VAW I ATAVAAI VAAWAL I YCRKKR I SAN 
S TDP VKAAQ F E P PGRQM I AI RKRQ LEETNND YE TADGG YMTLNP 
RAPTDDDKNI YLTLPPNDHVNSNN 


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAWHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKXPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNSEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTPRVKKSPKRSPLSDPPSQDP
TPAATPETPPVISAWHATDEEKLAVTNQKWTCMTVDLEADKQD
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE
ALVNTAAXNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQALKVHA\EEKLDRANAE\IAQVRGKAQQEQAAHQASLAERSS
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPIARPSEINFDGIKLDLS 
HEFSLVAPNTEANS FESKDYLQVCLR I RP FTQS E KELESEGCVH 
I LDSQTWLKEPQC I LGRLSEXSSG \QM\AQKFS FFPGFLGPAT 
TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTE 
ENIRILPRTLNVLFDSLQERLYTKMNLKPHR5RBYLRLSSEQEK 
E E I AS KS ALLRQ I KE VTVHND SDDTL YGSLTNS LN ISEFEESIK 
DYEQANLNMANS I KFS VWVSFFEI YNEYI YDLFVPVSSKFQKRK 
MLRLSQDVKG YS F I KDLQW I QVSDS KEAYRLLKLG I KHQSVAFT 
KLNN AS S RSHS I FTVK I LQI EDS EMSR V I RVS E LS LCDLAGS ER 
TMKTQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVLKFS 
A I AQKVC VPDTLN S SQ E KLFG P VKS SQDVS LDSNSNSK I LNVKR 
ATISWEN3LEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 
TLEENKAFISHEEKRKLLDLIEDLKKKLINEKKEKLTLEFKIRE 
EVTQEFTQYWAQREADFKETLLQERE I LEENAERRLAI FKDLVG 
KCDTREEAAKDICATKVETEEATACLELKFNQIKAELAKTKGEL 
IKTKEELKKRENESDSLIQELETSNKKIITQNQRIKELINIIDQ 
KEDTINE FQNLKSHMENTFKCNDKADTS SL I INNKLICNETVEV 
P KDS KS KI CSERKRVNENELQQDEP PAKKGS I HVS SAITEDQKX 
S EEVRPN I AE IED I R VLQENNEGLRAFLLTI ENELKNEKEEKAE 
LNKQ I VH FQQELSLS E KKNLTLS KE VQQ I QSNYD I AI AELHVQK 
S KNQEQBE KIMKLSNE I ETATRS I TNNVSQ I KLMHTKI DELRTL 
DSVSQISNIDLLNLRDLSNGSEEDNLPNTQLDLLGNDYLVSKQV 
KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 
QIEKLQAEVKGYKDHNNRLKEKEHKNQDDLLKEKETLIQQLKEE 
LQEKNVTLDVQIQHVVEGKRALSELTQGVTCYKAKIKELETILE 
TQKVERSHSAKLEQDILEKESIILKLERNLKEFQEKLQDSVKNT 
KIJLNVKELKLKEEITQLTNNTjQDMKHLIjQIiKEEEEETNRQETEK 
LKEELSASSARTQNMiNADLQRKEEDYADLKEKLTDAKKQIKQV 
QKEVS VMRDEDKLLR I KINELE KKKNQCSQELDMKQR\T IQQLK 

eqlinqkveeaiqqyerackdlnvkekiiedmrmtleeqeqtqv 
eqdqvl\eakleeverlateldrwrvkcndletknnqrsnkehe 
nntdvi^kltni^dei^eseqkynadrkkwleekmmlitqakea 
enirnkemkkyaedrerffkqqnemeiltaqltekdsdlqkwre 
erdql vaale i qlkal i s snvqkdne i eqlkr 1 1 sets kietqi 
md i kp kr i s sadpdklqtepls ts fe i srnki edgswlds cev 
stendqstrfpkpeleiqftplqpnkmavkhpgcttpvtvkipk 

ARKRKSNEMEEDL VKCENKKNATPRTNLKFPI S DDRNS S VKKEQ 
KVA I R PSS KKT YS LRS Q AS 1 1 GVNLAT KKKEGTLQKFGD FLQHS 
PSII/JSKAKKIIETMSSSKLSNVEASKENVSQPKJiAKRKLYTSE 
ISSPIDISGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLWVAMRCLG 
ASPTPGEVQRHLQTHGIDGNGELDFSTFLTIMHMQIKQEDPKKE 
ILIiAMLMVDKEKKGYVMASDLRS KLTSLGE KLTHKEV\ DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248 


SHCSS GME I P PTNYPASRAALVAQNY INYQOGTPHRVFE VQKVK 
QASMED I PGRGHK YRLKFAVEE 1 1 QKQVKVNCTAE VLYPS TGQE 
TAPEVNFTFBGETGKNPDEEDNTFYQRLKSMKE PLEAQNI \ PDN 
FGNVSPEMTLVLHLAWVACGYI I WQNSTEDTWYKMVKI QTVKQV 
QRNDDFIELDYTILLHNIASQEIIPWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAFLAPRDFPFPPKLLIHPQAWRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD 
S KKLPGLGD PD I DWE E S VCLNL I LQKL D YMVTCAVCTRADGG D I 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
V FS DMTVG KG EMVCVE L VAS D KTNTFQG V I FQGS I R YEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 
VSRVSTGDTSPCGTEEDSSPASPMHERVTSFSTPPTPERNNRPA 
FFSPSLKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHNAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 
LTDILEVRQKPILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVXVEHV
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL
WEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D
MNYEPMGRALRYYYQRGIIAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5403 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF


5404 


187 


1111 


LPVTLI FAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRI IYDYGT 
DNFEESIFS QD YE D KY LDGKN I KEKE TV 1 1 PNEKS LQLQKD E A I 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSLAENQLLKLPVLPPKLTLFNAXYNKIKSRGIKANAFK 
IUJNNIjI r LiLDHNAIjE.5 VPLNLtPhSLRVIHLQFNNIASITDDTF 
CKANDTS Y IRDRI EE I RLEGNP I VLGKHPNS FICLKRLPIGS Y F 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
I LSLDQ I KAIRGSNEYTEGPSWXRPAPRTAPRQEKHERTHE 1 1 
P INVNNNYEHRHTSHLGHAVL PSNARGP ILSRS TS TGS AASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQHKFI CE QCG KCKCG E CTAPRTL P S CLACNRQ CLCS AE 
SMVEYGTCMCLWKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY 
RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDR 
CK VHRGG YNG FNQCL P ATQS K I FL FDKCVKAFHKFSNSNRHK I S 
HTEKKLFKCKECGKSFCMLSHLAQHKIIHTRVNFCKCEKCGKAF 
NCPSIITKHKRINTGEKPYTCEBCGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS I LTTHKI IRTGEKFYXCKECAKAFNQSS 
NLTEHFCKIHPGEKPYKCEECGKAFNWPSTLTKHKRIHTGEKPYT 
CEECGKAFNQFS^TTHKRIHTA\EKFYKCTECX3EAFSRS \SNL 
T KHKE I HTE KKP YKCE E CG KA FKWS S KI/TEH KLTHTGE KP YKCE 
KCGKAFNCPS 1 1 TKHNR INT3EKP YTCEECOKVFNWSSRLTTHK 
KN YTRY KLYKCE E CX3KAFN KSS I LTTH KKI HIE KKFYKCE ECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNLTTHKKIHTGEKFYKCEECGKAFTQ 
SSNLTTHKKIHTGGKPYKCEECGKAFNQFSTIiTKHKIIHTEEKP 
YKCEECGKAFKWSSTLTKHKIIHTGEKPYKCEECG\KAFKLSST 
LS THK 1 1 HTG E KP YKCE KCGKAFNR P SNL I EH KKI HTGEQ P YKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NKIHTGEKL YKPEDVTVIIjTTPQTFSNI K 


5407 


3 


659 


RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AIILLO^TIAQSIKGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDKKKWNLGSNAKD PRGM YQCKGS QKKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVSIFDLAVGVYFIAGTGMEFR 
Q S \ RAS D KQTLL P \ ND PAP TQ P LKD PR KMTQ YSHLQGN \ QLRRN 


5408 


2745


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP 
HARQHTP LPLGSADYRRWS VRPQGPHRDPKDSRDAAKREQGS L 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTCCTPSIJUiAQGGPOGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM 
NS S I QC VSNTQPL TQYF I SGRHLYE LN RTNP IGMKGHMAKC YGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDG LHE DLNR VH E K P YVELKD S DGRPD W E VAAE AWDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
E I TV I KLDGTT P VR YGLRLNMDE KYTGIjKKQLSDLCGLNS EQI L 
LAEVHGSNIKNFPQDNQKVRIjSVSGFLCAFEIPVPVSPISASSP 
TQTDFSSS PSTNEMFTLTTNGDLPRPI FI PNGMPNTWPCGTEK 
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR 

pslfgmplivpctvhtrkkdlydavwiqvsriasplppqeasnh 

AQDCDDSMGYQYPFTIjRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHIiRYQTSQERWDEHESVEQSRRAQ 
VE PINLDS CLRAFTS EEELGENEMYYCS KCKTHCLATKKLDLWR 
LPPILIIHLKRFQFVNGRW1K5QKIVKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELS EPRILARE VKKVDAQS s AGEEDVLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGS KNKLS S SKENLDAS KENGAGQ I CELADALSRGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG 
NHS EE DS TDDQREDTR I KP I YN LYAI S CHSG I LGGGH YVTYAKN 
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFIiPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5409 


2745 


6128 


OGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRIiP 
HARQHTPLPLGSADYRRWSVRPOGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRP I CSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSIiALAQGGPQGSWRFLEWKSMP 
RLPTDLD I GGPWFPH YD FERS CWVRAI SQEDQLATCWQAEHCGE 
VRNKDMSWPEEMS F I ANSSKI DRHKVPTBKGATGLSNLGNTCFM 
NS S I QCVSNTQPLTQ Y F I SGRHLYELNRTN P I GMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
E I TVI KLDGTTP VR YGLRLNMDEKYTGLKKQLS DLCGLNS EQIL 
LAE VHGSN I KNFPQDNQKVRLS VSGFLCAFE I PVP VS P I SASS P 
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCOTEK 
NFTNGMVNGHMPSLPDS PFTGYI I AVHRKNNRTELyFLSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAVW IQVSRLAS PLPPQEASNH 
AQDCDDS MGYQYPFTLRVVQKDGNS CAWCPWYRFCRGCK I DCGE 
DRAF I GNAY I AVDWHPTALHLRYQTSQER WDEHES VEQS RRAQ 
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
LPPILIIHLKRFQFWGRWIKSQKIVKFPRBSFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRIIiAREVKKVDAQSSAGEEDVUjSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRL PQ IGS KNKLS S S KENLDAS KENGAGQ I CELADALS RGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG 
NHSEEDS TDDQREDTR I KP I YNLYAI S CHSG I LGGGHYVTYAKN 
PNCKWYCYNDSSCKELH PDE I DTDSAY I LFYEQQG I DYAQFLPK 
TDGKKMADTS S MDE D FE S D Y \ E KY CVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHL YKLLVIGDLGVGKTS 1 1 KRY 
VHQNFSSHYRATIGVDFALKVLHWDPETWRLQLWDIAGQERFG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KPVSWLLANKCDOjGICDVLMNNGLKNDQFCKEHGFVGWFETSAK 
ENINIDEASRCLVKHILANECDLMES I EPDWKPHLTSTKVASC 
SG\ CAK I LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGSFGKPSPVTGLRAARRRRTRPSAPAAPSVGC 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDSEQTLQNHQQPN 
GGEPFL IGVSGGTASGKS S VCAKI VQLLGQNE VDYRQKQ WI IjS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNELILKTLKE I TEG 
XT VQ I P VYDFVSHS R KEETVTVYPADVVLiFEG I LAFYS QE R / IR 
DLFQMKLFVDTDADTRIiSRRVLKDI SERGRDLEQI LSSSTLR FV 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGGPS\NRQTNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFHKEANFWFEVS G YL I S PLRS P FVDPALE WSLMAS P WN 
KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHEIFRDSSLV 
NEQSQITRRKKRKKDFQHLISSPLKKSR I CDETANATSTLKKRK 
imRYSALEVDEEAGVTVVLVDKENINNTPKHFRKDVDWCVDMS 
IEQKLPRK\PKTDKFQVLAKSH\AHKSEALHSKVREKKNKKHQR 
KAASWESQRA\RDTLPQSEFPTQEESWLSVGPGGEITELP\ASA 
HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQIJ^PTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
E VGADMQES \ RPAVGLHGETAG I PAPAYKNKSKKKKKKSNHQE F 
EAVAMPESLESAYPEGSQVGSEVGTVEGSTAUCGFJCESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VKS RPRQKKTQACLAS KHVQEAPRLE PANEEHNVETAEDSE I R Y 
LS ADSGDADDSDADLGSAVKQLQEFIPNI KDRATST I KRMYRDD 
L E R FKEFKAQGVAI KFGKFS VKENKQLE KNVEDFLALTG IE SAD 
KLLYTDRYPEEKSVITNLKRRYSFRLHIG\RNIARPWKLIYYRA 
KKMFDVNNYKGRYSEGDTE KLKMYHSLLGHDWKT I GEMVARRS L 
S VALKFSQ I S SQRNRGAWS KSETRKL I KAVEE VI LKKMS PQBLK 
EVDSKLQENPESCLS I VRE KLYKGI S WVEVEAKVQ TRNWMQ C KS 
KWTE IL TKRMTNGRR I YYGMNALRAKVS LIERLYE INVEDTNE I 
DWEDIjASAIGDVPPSYVQTKFSRLKAVYVPFWQKKTFPEIIDYL 
YETTLPLLKE KLE KMME KKGTKI QTPAAPKQVFP FRDI FYYEDD 
S EGGGHRKRKRRPRRHAWFTPVI PVLWEAKAGWII 


5413 


3753


1304 


R F P AGVAPRRAMANVS KKVS W SGRDRDDEEAAPLLRRTARPGGG 
TPLLN3AGPGAARQSPRSALFRVGKMSSVKLDDELLEP\DMDPP 
HP F PKEI PHNEKLLS LKYES LDYDNSENQLFLEEERRI NHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTE KGG LS FS LLL WAT LNAAFVL VGS V I VAF I EP VAAG S G I P Q 
I KCFLNGVKI PHWRLKTLVI KVS GVI LS WGGLAVGKEGPM I H 
SGSVIAAGISQGRSTSLKRDFKIFEYLRRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTWRIFFASMISTFTLNF 
VLS I YHGNMWDLSS PGL INFGRFDS EKMAYTI HEI PVFI AMGW 
GG VLGAV FNALN YWLTM FR I R Y I H RP CLQ V I E AVL VAAVTATVA 
FVL I YSS RDCQ PLQGGSMS YPLQLFCADGEYNSMAAAFFNTPEK 
SWSLFHDPPGSYNPLTLiGLFTLVYFFLACWTYGLTVSAGVFIP 
SLL I GAAWGRLFG I S LS YLTG AA I WADPG KYALMGAAAQ LGG I V 
RMTLS LTVI MME ATS NVTYGF P I ML VLMTAX I VGDVF I EG L YDM 
H I QLQS VPFLHWEAP VTSHSLTARE VMSTP VTCLRRRE KVGVI V 
DVLSDTASNHNGFPWEHADDTQPARLQGLI LRSQL I VLLKHKV 
FVERSNLCLVQRRLRLKDFRDAYPRFPPIQSIHVSQDERECTMD 
LS E FMNPS P YTVP QEAS L PR VP KL FRALGLRHLVVVDNRNQ VVG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130 


390 


GVASAWDRALFS PLLS PTSRVFRTSP PRCVSTETGRRDRARVPS 
QWCSVLQGKLPVSGRTSLACVRSILLSPASSPRKVGIVGGTGAR 
AGAAPRDHGRVRHRRPS SARRMTRTTGQCLAPRGCQGPRGTRS P 
RS PRSRTRRGCS AS PACLP / CRS AL I VAVLCY INLLNYMDRFTV 
AG VLPD I EQ F FN I GDS S SGLIQTVF I SS YMVLAPVFG YLGDRYN 
RKY LMCGG I AFWS L VTI»G S S F I PG EHFWLLLLTRGLVG VG EASY 
ST1 APTLI ADLFVADQRSRMLS I FYFAI P VGSGLGYI AGS KVKD 
MAGDWHWALR VTPGLG WA VLLLFL WRE P P RGAVERKS DLPPL 
NPTS WW ADLRALARN P S FVL S S LG FTAVAFVTG SLALW AP AFLL 
RS RWLGETP PCL PGDS CS S SDSL I FGL ITCLTGVLiGVGLG VE I 
SRRLRHSNPRAD PLVCATGLLGSAPFLFLSLACARGS IVATYIF 
I FIGETLLSMNWAI VAD ILL YWI PTRRSTAEAFQI VLSHLLGD 
AG S P YL I GL I S DRLRRNW P P S FI>S E FRALQFS LMLCAF VGALGG 
AAFLGTAHLH 


5415 


693 


2986 


I P P KTKLE LQKH \ LTT LT \ NQ E Q AT I FE E VQ KLRP RNEQR ENEL 
IISFLRCLFEBKQKEHIHIGEMKQTSQMAAENIGSELPPSATRF 
RLDMLKNXAKRSLTESLESILSRGNKARGLQBHSISVDLDSSLS 
STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLSPQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 
RYHSVSTETPHERKDFESKANHLGDSGGTPVKTRRHSWRQQIFL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTSRELRELWQKAILQQI LLLRMEKENQKLQASENDLLNKR 
LKLDYEE ITPCLKEVTTVWEKMLSTPGRS KI KFDMEKMHSAVGQ 
GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SQQHAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQE 
VG YCQGLS FVAG I LLLHMSEEEAFKMLKFLMFDMGLRKQ YRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 
ASQFPLG FVAR VFDM I FLQGTE VI FKVALSLLGSHKPLILQHEN 
LETIVDFIKSTLPNLGLVQMEKTINQVFEMDIAKQLQAYEVEYH 
VLQEELIDSSPLSDNQRMDKLEKTNSSLRKQNLDLLEQLQVANG 
RIQSLEATIEKLLSSESKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDIiDFLK 
YVDDIQKGNTI KRLN I QKRRKPS VPCPE PRTTSGQQG I WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPS PQL P KHNLHVT KTIjMETRRRLEQERATMQMTPG E F 
RR PRLlAS FGGMGTTS SLPS FVGSGNHNPAKHQLQNG yqqngd yg 
S Y AP AAPTTS SMG S S I RHS PLSSGIST P VTNVS FMHLQHI REQM 
A I ALKRLKELEEQ VRT I PVLQ VKI S VLQEE KRQLVSQLKNORAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RS VAVGAEENMND I WYHRGS RS CKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADKE I ELQQQTI ESLKEKI YRLEVQLRETTHDREMT 
KL KQ E LQ AAGS R K KV D KATMAQ P LVFS KW EA WQTR DQM VG S H 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECAS RGVNTEAVS QVEAAV 
^VPRTADQDTSTDLEQVHQFTKTETATLIESCTNTCLSTLDKQ 
TSTQ TVETRTVAVGEGR VKD INS STKTRS IG VGTLLSGHS GFDR 
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLIiAEQQ 
TL1AENYSE1AEAFGEPHSG«GSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 
QE VGTS EG KP I SS LDAF PTQ EGTLS P VNLTDDQ IAAGL YACTNN 
ESTLKSIMKXKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQKSAI PAMVGD YI AAFEA I 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAGYTP I MLAALAAVEAE KDMRIVE ELFGCGDVNAKASQAG 
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG 
HVE IVKLLLAQPGCNGHLEDNDGSTALS IALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 


5417 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPS PQLPKHNLH VTKTLMETRRRLEQERATMQMTPGE F 
RRPRLAS FGGMGTTS SLPS FVGSGNHNPAKHQLQNGYQGNGDYG 
S Y AP AAP TTS S MGS SIRHSPLSSGI ST P VTNVS PMHLQH I REQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RS VAVGAEENMNDI WYHRGSRS CKDAAVGTL VEMRNCG VS VTE 
AMLG VMTBAD KE I E LQQQT IESLKEKI YRLE VQLRETTHDREMT 
KLKQE LQAAG S RKKVD KATMAQ P L VFS KWE AWQTRDQMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKVVGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCS VDVTVCS PKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTS TDLEQVHQFTNTETATL I ES CTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKD INS STKTRS IGVGTLLSGHSG FDR 
PSAVKTKESGVGOJNINDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 
QBVGTSEGKP I S S LDAFPTQEGTLSPVNLTDDQI AAGL YACTNN 
ESTLKS IMKKKDGNKDSNGAKKNLQFVG INGG YETTSSDDS S SD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQKSAI PAMVGDY IAAFEAI 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAG YT P I M LAALAAVEAEKDMR I VEE LFG CGD VNAKAS QAG 
QTALM LAVS HGR I DMVKG LLACGAD VNIQDDEGS TALMCAS EHG 
HVEIVKLUoAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGSFD 


5418 


24 


1133 


SVPRAGGDMETGAAELYDQALLG I LQHVGNVQDFLRVLFGFLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EPPI 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQ VS VALS S S S I R VAML E ENGER VLMEG KLTHKI NTES S L WS L 
EPGKCVLVNLSKVGEYWWNAILEGEEPIDIDKINKERSMATVDE 
EEQAVLDRLTFDYHQKLC^KPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFNIS PGAVQF 


5419 


1395 


259 


GTHP LD P D LVS RTS VQG P LM TMAC PGM S DTE E S P FLG PRAAE EG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSPJOIKVNWKHPERADA 
KDPASLPQC/IX3P/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRI QQWQQS P C I AEEHGKKLLER I RREQQSARTRLQEMERRFH 
ELEAI I LRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHP INPR 
VALRHMERCYAKYESQTS FGS M YPTR I EGATRLFCD VYNPQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHYCWEKLRRAEVDLERVRVWYKLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGAC P FKGG ASGRLYLS PRLPRVS VAGCEER PLG WVWVLGG 
GGFL PARP PRAQRHLGFSHAEQSMEAPD YEVLSVREQLFHER I R 
ECI ISTLLFATLYILCHIFI J THFK'lfDaT?PTT\r , MMVMD'DCT'Df / ~ 

LLELCTFTLAI ALGAVLLLPFS I ISNEVLLSLPRNYYIQWLNGS 

LIHGLWNLVFLFSNLSLIFLMPFAYFFTESEGFAGSRKGVLGRV 

YETVVMI^LLTLLVIX3MVWVASAIVX)KNKANRESLYDFWEYYLP 

YLYSCISF LG VLLLLVCT P LGLARM FS VTG KLL VKPRLLE DLEE 

QLYCSAFEEAALTRRI CN PTSCWLP LDMELLHRQVLALQTQRVL 

LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 

EAAMPRGNC^TSI>GQVSFSKIiGSFGAVIOVVLIFYLMVSSVVGF 

YS S P L FRS LR PRWHDTAMTQ 1 1 GNCVCLL VLSS ALP VFS RTLGL 

TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 

RAELIRAFGERE 


5421 


117 


1733 


NE AGG ACP F KGGAS GRL YLS PRL PR VS VAGC EER P LGWVW VTiGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
1£> i LiLif Al JjY iijCHIFliTRFKKPAEFTT\GMMKNPPSTRL/ 
LLELCTFTLAI ALGAVLLLPFS 1 1 SNEVLLSLPRNYYIQWLKfGS 
LIHGL WNL VFL FSKLS L I FLMPFA YFFTESEG FAG S RKG VLGR V 
YET\miL>OiLTLLVU3MVWVASAIVDKNKANRESLYDFWEYYLP 
YLYS C r S FLGVLLLLVCTPLGLARMFS VTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELIiHRQVLALQTQRVL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELL I D 
E AAM P RGMQGTS LGQ VS FS KLGS FGAVI QWL I FY LMVSS WGF 
YSSPLFRSLRP RWHDTAMTQ 1 1 GNC VCLLVLS S AL P VFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 


5422 


3 


1263 


SCGBS LPTWLAGASRPG IGRKGGAWGGRGGS S PAQVLLS PGP VF 
KAGCNWWHLSRDQAG VQRCDLGS SQP P PLGFKRFS CLSLPS SWD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGVVPPGTQVEQLLYAKKLYDSAF 
H PDTG E KMNV 3 G RMS FQL PGGM 1 1 TGFMLQ FYRTMPAV I FWQ WV 
NQS FNALVNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAPPLVGRWVPFAAVAAANCVNIPMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 
FMQKVKVL / SAPLQ VMLS G CFL I FM VP VACGLF PQKCELP VS YL 
EPKLQDTIKAKYGELEPYVYFNKGL 


5423 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ 
PPRLHAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANRE P VAERSE PALS GLP PATMGSGDLLLSGES QVEKTKLSS SE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAEYWACVLPDSLPPS PDRHS PLWNPNKE YEDLLD YT 
YPIjRPGPOTiPKP'Tir)^p\rpDr>p\n'.riric:n\mT hcitcvcdrctt iron 

TNVS PNC P PAEATALP FSG PRE P S LKQW PS RVPQKQGGMGLAS W 
SQIJ^TPPJ^PGSRDARWEJIREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
S NLTS LK\ S S LQL YRQFKKD I DEHQS LTE S VLQKG E I LLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDSILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ 
P P RLEAEGG LIS P VWGAEG I PAPT CW I GTD PGGPS RAHQPQAS D 
ANRE P VAERS E P ALSG L PP ATMGSGDLLLS GE SQVE KTKLS S S E 
E FPQTLSLPRTTI CSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPVVPQEPSSVVGLGPRPQWSPQPVFSGGnASGL 
GRRRLS FQAEYWACVLPDSLPPS PDRHS PLWNPNKEYEDLLDYT 
YPLRPGPQLP KHLDS RVPADP VLQDSGVDLDS FS VS PAS TLKS P 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW 
SQIASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVS S LVS YLGS I STLVTLPTGD I KGQS PLE VS DSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR 
S NLTS LK\ S S LQ L YRQ FKKD I D EHQS LTES VLQ KGE I L LQ CLLE 
NTP VLEDVLGR I AKQSGELESHADRL YDS I LAS LDMLAGCTLI P 
DKKPMAAMEHPCEGV 


5425 


1086 


115 


GFCPS PSLGHQP PRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NS YWRVSTVHGNVITTNTI FENLW FS CATDS LGVYNCWEFPSML 
ALSGYIQACRALMITAILLGFLGLLTjGIAGLRCTNIGGLELSRK 
AKLAATAGAPH \ I LPG I CGMVAI \ S W YAFNI TR\ DFSDPL YPGT 
KYELGPALYLGWSASLIS ILGGLCLCS ACCCGSDEDPAAS ARRP 
YQAPVSVMPVATSDQEGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLIF 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEAiAVLTS FGRRLLVL I P VYLA 
GAVGLSVGFVLFGIiALYLGWRRVRDEKERSLRAARQLLDDEEQL 
TAKTLYMSHRELP AWVS FPDVE KAEWLNKI VAQVWPFLGO YMEK 
LIiAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 
KEQ I LLDLN I S YVGDVQI DVEVKKYFCKAGVKGMQLHG VLRVIL 
E PL I G DLP F VGAVS MFF 1 RRPTLD I NWTGMTNLLD I PG LS S LSD 
TM I MDS I AAFLVLPNRLLVPLVPDLQDVAQLRS PLPRG 1 1 R IHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRV1DEELN 
PQ WGETYEVMVHE VPGQE I EVEVFDKDPDKDDFLGRMKLD VGKV 
LQASVLDDWFPLOGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG 
VSS RPDP PSAAI L WYLDRAQDLPMVTS EL YP PQLKKGNKEPNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFIX3RCKVRLTTVLNSGFLDEWLTLBDVPSGRLHLRL 
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLS I YMERAED 
LPLRKGTKHLS P YATLTVGDSSHKTKTI SQTSAPVWDES AS FLI 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
S SGQGQ VLLRAQLG I LVSQHSG VEAHSHS YSHS S S SLSEEPELS 
GG P PH I TS S AP E V\ RQRLTHVDS P LEAPAGPLGQVKLTLW YYS E 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
I^KVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA 
GAVGLSVGFVLF^IALYLGWRRVRDEKERSLRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEMLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRG SN P HLQTFT FT RVELG E KPLR 1 1 GV KVHPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGL5SLSD 
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN 
PQWGETYEVMVHE VPGQE I EVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPLOGGQ^QVHIiRXiEWLSLLSDAEKLEQVLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLiQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRI LYLDSSE I CF P TVPGC PG AWD VDS ENPQRGS S VDAPPR 
panTPDSQFGTBKVLRIHVLEAQDLIAKDRFLGGLVXGKSDPY 
VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFLGRCKVRIjTTA^LNSGFLDEWLTLEDVPSGRIjHLRL 
ERLTPRPTAAELEEVLQVNSLIQTQKSAElJWVLLSiyMERAED 
LPLRKGTKHLSPYATLTVGDSSHKTfCTISQTSAPVWDESASFLI 
RKPHTESLELQVRGEGTGVIX3SLSLPLSELLVADQLCLDRWFTL 
SSGOGQVLLRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 


3 


1839 


S S RSER LSA CA I AP P WLVS SRPARP AQLQRPG JCMVEDGAEE LED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAASIPYPHAMFTNDMMECKQDEIVMQGMDPSALEALINFAYNG 
NLAI DQQNVQS LLMGAS FLQLQS I KDACCTFLRERLH PKNCLGV 
i\w r nr. i ru'iv^ v Ln uaaw £»f I HUn r VE VSMSEE FLALPLEDVLEL 
VS RD ELNVKS E EQ VFEAALAWVR YDREQRGTFL \ RNLQS N I RLL 
FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 
AFRTR PRCCTS I AGL I Y AVGGLN S AGD SLNWEVFDP I ANCWER 

CRPMTTARSRVGVAWNGLLYAIGGYDGyLRLSTVQAYNTETDT 
» vjoi-inoivr.ortrjij I v vjjLajU-I i VCGGYDGNSSLSSVETYSPE 
TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YNHHT ATWH P AAGMLNKR CRHGAAS LG S KMFVCGG YDG SG FLS 1 
AEMYSSV\ADQWCLIVPM\HTRR \ SRVSLGGPAVGRLYAVWGVT 
TGQS N L\ S S VG D VLTPE TDCWTFM \ APMACHEGG VG VG CI PLLT 


5429 


823 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRER FHRFQP TYP YLQHE I DLPP T 1 S LSDGE EP P P YQGP CTLQ 

LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARLGGPCPPSSNS 

G I S ATCYGS GGRM ER P P P \ T V c: t?\t t P u v or c c ornrrw^o o n -n ™ «-» t 
M*^wv,j. WkJ vjvjivi lcunr r ±r \ i lo&v lVjrix FlaijbryHQQSSGPPSL 

LEGTRI^HTHIAPLESAAIWSKEKDKQKGHPL 


5430 


441 


1507 


gW^KRRRIOCIMKTIQPKMHNSISWAIFTGIAALCLFQGVPVRS 
G DAT F P KAMDNVTVRQG E S ATLRCT I DNR VTR VAW LN RS T I L YA 

GNDKWCLDPRWLLSOTQTQYSIEIQNVDVYDEGPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRHISPKAVGFVSEDEYLEIQGITREQSGDYECSASNDV\A 
APV\ VRR VKVTVNYPP YIS EAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI / EGKKGVKVENRP FLS KL I FFNVS EHDYGNYT 
CVASNKLGHTNAS IMLFGPGAVS EVSNGTS RRAGCVWLLPLLVL 
HLLLKF 


5431 


2 


1312


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\" 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEOQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY 
S DGE I S I CMEHMDGGS LDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQ IMHRDVKPSNILVNS RGEIKLCDFGVSGQL I DSMANS 
FVGTRSYMAPERLQGTHYSVQSDIWSMGZjSLVELAVGRYPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAI FE LLD Y I VNE P P PKLPNGVFTPDFQE FVNKCL I KNPAERA 
DLKMLTNHTF I KRS E VEEVDFAGWL CKTT .R LWnPr; t dtp <rn i / 


5432 


2 


1312 


AAAAPGSRIUUPLPDRPHMAHGYEAPPPPAPRSPAWI^SKPVV" 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRiEAFLTQKAKVGELKDDDFERISEUSAGNGGVVTKVQHRPS 
GLIMARKLIHLEIKPAIRNQI1RELQVLHECNSPYIVGFYGAFY 
SDGEISI CMEHMDGGS LDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YL REKHQ 1 MHRD VKPS N" I L VNSRG E I PCLCD FG VS GQL I D S MANS 
FVGTRSYMAPERLQGTHYSVaSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMA I FELLDY I VNEP PP KLPNGVFTPDFQEFVNKCL I KNPAERA 
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5433 


360 


1885 


SVQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
L FG W P S LVFVF KNED YF KDL CGPDAG P I GNATGQADC KAQD E R F 
SLIFTI.GSFMNNFMTFPTGYIFDRFKTTVARLIAIFFYTTATLI 
I AFTSAGS AVLLFLAMPMLT IGG I LFL I TNLQ I GNLFGQHRST I 
I TLYNGAFDSSSAVFLI IKLLYEKGISLR/ VLLHLHLCLQ YLAC 
STHFPPDAPGAHPI PTAPQLQLWP VPWEWHHKGREX5/QQLS MKT 
GS YSQRS S FQRRKRPQGQGRSRNS APSGATL/ CS RRFAWHLVWL 
S V I QLWHYL F IGTLNS LLTNMAGGDMAR VS TYTNAF AFTQ FGVL 
CAPWNGLLMDRLKQKYQKEARKTGSSTLAVALCSrVPSLALTSL 
LCLGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTIAFP 
S EHFGKLFGLVMALS AWSLLQFP I FTLI KGSLQNDPFYVNVMF 
MLAILLTFFHPFLVYRECRTWKESPSAIA 


5434 


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDETAITCPQCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPI AGECPE CHYPLL 1 E KKT 
AQG VKHFCAS KQOGKP VS AE 


5435 


4704 


1597 


PGDSSQRLAEMSNAKERKHAKKMRNQPTNVTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
GMFRKKGGWKAGPEGTSQE I PKY I TASTFAQARAAEISAMLKAV 
T Q KS SNS L V FQTLP RHMRRRAMS HNVKRLP RRLQE IAQKEAEKA 
VHQKKEHS KNKCHKARRCHMNRTLEFNRRQKKNI WLETH I WHAK 
RFHMVKKWGYCLGERPTVKSHRACYRAMTNRCIiLQDLSYYCCLE 
LKGKEEE I LKALSGMCN I DTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
I KAACQCVEPI KSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKP I KKI IGDGTRDPCLP YSWI S PTTG 1 1 1 SDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DSVSLHCRQEAIFELLGGITSPAEIPAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHSFIWNQDICKSV 
TENKISDQDLNRMRSELLVPGSQLILGPHESKIPILLIQQPGKV 
TG EDRLGWG SGWD VLLPKGWGMAFW I P F I YRG VR VGGLKE S A VH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAP FCCPWEQLTQDWESRVQAYEEPS VAS S PNGKE SDL 
RRSEVPCAPMPKKTHQPSDE VGTS I EHPREAEEVMDAGCQES AG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGOQGLTREACLS I LGHFPRALVWVSLSLLSKGS PE 
PHTM I CVPAKEDFLQLKEDWHYCGPQESKHSDP FRS KILKQKEK 
KKRE KRQKP \GRAS S DGPAGE E PVAGQE ALTLGLW SG P LPRVTL 
HCS R TLLGFVTQGD FSMAVG CGEALG F VS LTGLLDMLS SQ P AAQ 
RGLVLLRPPAS LQYRFARI AI E V 


5436 


1781 


635 


ASDS I PWSEARTTRICLAQRGCQWSLPERMPLWFCGLPYSGKSR 
RAEELRVALAAEGRAVYVVDDAAVLGAEDPAVYGDSAREKALRG 
ALRASVERRLSRHDWILDSLNYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWNGSAQADVPKELEREESGAAESPALVTPD 
SEKSAKHGSGAFYS PELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 
QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAELSRLRR 
QFISYTKMHPNNENIiPQLANMFLQYLSQSLH 


5437


739 


1672 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTXDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWIiR 
GEPGAPSRYLGGPEECLQISTNLTLHLLELLASALLALCSRPLR 
AALDTLG LRGP LGLW LHGLLS FLAALHGLHAVL S LLTAH P LH FA 
CLFGLLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAP PS L RR PMMCQS EARQ^ PELRAAKWLHFPQLiALRRRLGQLS C 
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
D LHHYRNLS E FFRRKL K PQARP VOSLHS VI S P S DGR I LNFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT 
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARWIKELFCHNERWLTGDWKHGFFSLTAVGAT\NWGSIRIY 
FDRDIiHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL 


5439 


2443 


1152 


TKPRKRRHQPASGRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAP P S LRR P MMCQSEARQGP ELRAAKWLHFPQLALRRRLGO LS C 
MSRPALKXRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHHYRNL S E F FRRKLKPQ ARP VCGLHS VI S PS DGR I LNFGQ VK 
NCEVEQ VKGVT YSLES FLGPRMCTEDLP FPPAAS CDS FKNQLVT 
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLNSVNP 
GMARW I KELFCHNERWLTGDWKHGFFS LTAVGAT\NWGS I RI Y 
FDRDLHTNS PRHS KGS YNDFSFVTHTNREGVPMALRGEHLG /QS 
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL 


5440 


693 


253 


EPIPVTPDHRLVTMTHIV\QTFSPVNS \GQPPNYEMLKEEQEVA 
MLGAPHNPAP PMS TVI H I RS ET S VP D HVVWSL FNTLFMNTCCLG 
F IAFAYS VKSRDR KMVGD VTGAQAYAS TAKCLN I WALI LGI FMT 
ILLIIIPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVS PMKPLE IKTQCSGPRMDPKICPADPAFFS FIN 
NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKE 
LVQPPSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
P P AL FI PS TENEEQ \ RLAS ARAVPRNVQP YWYEE VTNVWINVH 
DIFOfPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSGjGYDW 
S EPFS PGEGEQS LTNAI WVNE ETKLVY FQGTKDT PLEHHLYWS 
YEAAG E I VRLTTPG FS HSCS MSQNFDMFVSH YSS VST P PC VHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALGPGKKHPTVLFVYGGPQVQLVNNSFKGIKYLRLNTLASLGY 
A VWI DGRGSCQRGLRFEGALKNQMGQ VEI EDQVEGLQ FVAEKY 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 
AYDTGYTERYMDVPENNQHGYEAGSVALHVEKLPNEPNRLLILH 
G FLDENVHFFHTNFLVSQLI RAGKPYQLQVALPPVS PQ I YPNER 
HS IRCPESGEHYE VTLLHFLQEYL 


5442 


1 


3474 


CGQRSRRRS PDMPEAKPAAKKAPKGKDAPKGAPKEAPPKBAPAE 
APKEAPPEDQSPTAEEPTGVFLKKPDSVSVETGKDAVWAKVNG 
KELPDKPTIKWFKGKWLELGSKSGARFSFKESHNSASNVYTVEL 
H I GKVVLGDRG YYRL E VKAKDTCDS CGFN I D VEAPRQD AS GQS L 
E S FKRTS E KKS DT AGE LDFSGLL KKREWE EEKKKKKKDDDDLG 
I P PE IWELLKGAKKSE YEKI AFQYGI TDLRGMLKRLKKAKVE VK 
KSAAFTKKLDPAYQVDRGNKIKLMVEISDPDLTLKWFKNGQEIK 
P S S KYVFENVGKKR I LT INKCTLADDAAYE VAVKDEKCFTELFV 
KEPPVLIVTPLEDQQVFVGDRVEMAVEVSBEGAQVMWMKDGVEIi 
TREDSFKARYRFKKDGKRHILIFSDWQEDRGRYQVITNGGQCE 
AE LI VEE KQLEVLQD I ADLTVKAS EQ AVF KCEVS DEKVTG KWYK 
NGVEVRPSKRITISHVGRFHKLV1DDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFLEIKVEYVPKQ\EPPKI PLGFASGGKTSENAD/ IV 
WAG?fKLRLDV\SITGEAPSPFA7\WLKG\DEVFTTTEGRTRIE 
KR VDCSS FVI ESAQREDEGRYTI KVTNP IGED VAS I FLQWD VP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKPFMPIAPTSEPLHLIVEDVTDTTTTLKWRPPNRIGAGGIDX3Y 
LVEY CLEGS EE W VP ANTE P VERCG FTVKNLPTGAR I LFR WGVN 
IAGRS EPATLAQPVT I RE I AEPPKI RL PRHLRQTY I RKVGEQLN 
LWP FQG K P R P Q VVWTKGGAPLD TS R VHVRTS D FDTVFF VRQAA 
RSDSGEYELSVQIENMKDTATIRIRVVEKAGPPINVMVKEVWGT 
NALVEWQAPKDDGNSEIMGYFVQKADIOCTMEWFW^ERNPJITSC 
TVS DL I VGNE Y Y FR VYTEN I CGLS DSP G VS KNTAR I LKTG I TFK 
PFEYKEHDFRMAPKFLTPLIDRVWAGYSAALNCAVRGHPKPKV 
VWMKNKME I REDP KFL I TNYQG VLTLNI RRPS PFDAGTYTCRAV 
NELGEALAECKLEVRVPQ 






WO 01/53312 



PCT/US00/34263 



5443 




1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPHRSRSAAEPA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKG WF S VTT VD LKRKP ADLQNLAPGTH P P F I TENS E VKTD V 
NKIEEFLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAYIKNS 
RPEANEALERGLLKTLQKLDEYLNSPLPDEIDENSMEDIKFSTR 
KFLDGNEMTLADCNLLPKLHI VKWAKKYRNFDI PKEMTG I WRY 
LTNAYSRDEFTNTCPSDKEVE I \ AY S DVAKRLHQVKS RLLKE VS 
FMSSP 


5444


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDILRAY 
RAQKNIJDFEDP Y*DSESRLEPDPAGPGDS KNPGDAKYG S PKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRMPQEDER PADE YDQPWEWKKDH I S RAFAVQFDS PEWERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFP£VPELVLHYSSRPLPVQGAEHLALLYPVVTQTP*Q 
+ PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH
QGGGCGYGOSQGPSGRPRGGAGS RH 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKIiANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTCHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLPPNLSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
ALPGQPLPGAS VRGLHP VQKVILNYPS PWDQEERPAQRDCS FPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDIFEDRIRGIDIIKWMERYLRDKTVMIIVAISPKYKQ 
DVEGAESQLDEDEHGLHTKYIHRMMQIEF1KQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


S S WS WCTGRMRKTRLWGLLWMLFVS ELRAATKLTEEK YELKEGQ 
TLD VKCD YTLE KF AS S Q KAWQ 1 1 RDGEMPKTLACTER PS KNS HP 
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKEPH 
ML FDR I RLWT KG FSGT PG SNENS TQNVYK I P P TTTKALCPLYT 
TPRTVTQAPPKSTADVSTPDSEINIjTNVTDIIRVPVFNI VILLA 
GGFLSKSLVFSVLFAVTLRS FVP * AHE PTRMSSDFQPHPSGSCA 
KGGGRR 


5447 


207 


617 


MTARTLS LMAS L VAYDDSDS EAETEHAGS FNATGQQKDTSGVAR 
P PGQDFASGTLDVPKAGAQPTKHGS CE DPGG YRLPLAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQ VKLSRNFP KSS FHAQS ESETVGKNGSS FQKKKCEDCWPY 
TPRRLRQRQALSTETGKGKDVEPQGPPAGRAPAPLYVGPGVSEF 
IQPYLNSHYKETTVPRKVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMDKTFKVWNAVDSGHCLQTYSLHTEAVRAARWAPCGRRIL 
SGGFDFALHLTDLETGTQLFSGRSDFRITTLKFHPKDHNIFLCG 
G FS S EMKAWDI RTG KVMRS YKAT IQQTLD I L FLREGSE FLS S TD 

FLAQTNGNYLALFS TVW P YRMS RRRR YEGHKVEGYS VGCECS PG 
GDLLVTGSADGRVLMYSFRTASRACTLQGHTQACVGTTYHPVLP 
SVLATCSWGGDMKIWH*AFHWLSLGEAIGDLAPARGYSGPGRSL 
KS PS PS KS LLVLLCGRAMFQ PATCP WQLPALSK 


5448 


194 


1833 


MAS KVTDAI VW YQKKIGAYDQQ I WE KSVEQRE I KGLRNKPKKTA 
HVKPDLI DVDLVRGS AFAKAKPES P WTS LTTKGIVRWFFP FF F 
R WWLQVTS KVI FFWLL VL YLLQ VAA I VL FCSTS S PHS I PLTE VI 
G P I WLMLLLGTVH CQI VSTRTP KPP LSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPES ETED VLWE D L LHCAE CHS S CTS E TDVENHQ INPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLB I SGM I MNR VNSHI PG I G YQI FGNAVSL I LGLTPFVFRLS QA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFWRVSLVWI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCS SRCS S S RQDSES ARPESETED VLWEDLLHCAECHS S CT 
SETDVENHO I NPCVKKE YRDDPFHQSHLPWLHSSHPGLEKI SAI 
VWEGNDCKKADMS VLE I SGM I MNRVNSH I PG I GYQI PGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV 
IISFWRVSLVWIFFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MAS KVTDAI VW YQ KKI GA YDQQ I WE KS VEQRE I KGLRNKP KKTA 
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF 
RWWLQVTS KVI FFMLLVL YLLQ VAAI VL FCSTS S PHSIPLTE VI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHIiEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GS KKAKNS I D KS TETDNG Y VS LDG KKTVKSGEDG I QNHEP QC ET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQ INPC
VKKE YRDD P FHQS HLPWLHS S HPG LE K I S AI VW E GNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFWRVSLVWI 
FFFLLCVAERTYKQVGIM * T S EG VLRNRKSHH Y KKH YPNEDAP K 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
S ETDVENHQ INPCVKKEYRDD PFHQSHLPWLHS SHPGLEKI SAI 
VWEGNDCKKADMS VLE I SGM I MNRVNSH I PGIG YQ I FGNAVS L I 
LGLT P FVFRLS QATDLEQ LTAH S AS E L YVI AFGSNE D VI VLS MV 
I ISFWRVSLVWIFFFLLCVAERTYKQVGIM 


5450 


8136 


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEAIjLLLAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDS I PHTW 
LTVVEGWATLSVDGFLNASSAVPGAPLEVPYGLFVGGTCTLGLP 
YLRGTS RPLRG CLHAATLNGRSLLRPLTPD VHEG CAEEFSAS DD 
VAIiGFSGPHSIiAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 
RRGDF I YVD I FEGHLRAWE KGQGTVLLHNS VP VADGQPHE VS V 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA 
AGCRLEEEEYEDDAYGHYEAFSTLAPEAWPAMELPEPCVPEPGL 
PPVFANFTQIiLTI S P L WAEGGTAWL E WRH VQP TLDLMEAELRK 
SQVLFSVTRGAHYGELELDILGAQARKMFTLLDVVNRKARFIHD 
GSEDTS DQLVLEVS VT ARVPM P S CLRRGQTYLLP I Q VNPVND P P 
HIIFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VS DGLQAS P P ATL KWAI RPAI Q I HRS TGLRLAQGS AM P I LPAN 
LS VETNAVGQDVS VLFRVTGALQ FGELQKHSTGGVEGAEWWATQ 
AFHQRDVEQGRVRYLSTDPQHHAYDTVENLAIjEVQVGQE I LSNL 
S FPVTIQRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 
PPTFHYEWQAPRKGNLQLQGTRLSDGQGFTQDDIQAGRVTYGA 

TARASEAVEDTFRFRVTAPPYFSPLYTFPIHIGGDPDAPVLTNV

LLVVPEGGEGVLSADHLFVKSLNSASYLYEVMERPRLGRIiAWRG 
TQD KTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 
SSGDMAWEEVRGVFRVAIQPVNDHAPVQTI SRI FHVARGGRRLL 
TTDDVAFS DADSG FADAQ LVLTR KD LL FGS I VAVDEPTR P I YRF 
TQEDLRKRRVLFVHSGADRGWIQLQVSDGQHQATALLEVQASEP 
YLRVANGSSLWPO^GQGTIDTAVLHLDTNLDIRSGDEVHYHVT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLS PSDTMAF 
S VEAGPVHTDATLQ VT I ALEGPLAPLKL VRHKKI YVFQGEAAE I 
RRDQLEAAQE AVP PAD I VFS VKS P P SAG YLVMVSRGALADE P PS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 
GVL VELE VL P AA I PLE AQNFS VPEGGS LT LAP P LLR VSG P Y F PT 
LLGLSLQVLEPPQHGPLQKECX3PQARTLSAFSWRMVEEQLIRYV 
HDG S ETLT D S F VLMANAS EMDRQS HP VAFT VTVLP VNDQ P P I LT 
TNTGLQMWEGATAPIPAEALRSTDODSGSEDLVYTIEQPSNGRV 
VLRGAPGTE VRS FTQAQ LDGGL VLFS HRGTLDGG FPFRLS DGEH 
T S PGHF FR VT AQKQ VLLS LKGSQTLTVC PG S VQ PLSSQTLRAS S 
SAGTDPQLLLYRWRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GN 1 L YEHEMP PE P FWEAHDTLELQL S S P P ARDVAATLAVAVS FE 
AACPQR PSHL WKN KGL WVP EGQRAR I T VAALDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQLLVSEEPLHAGQPHFLQSQLAAGQLV 
YAHGGGGTQQDGFHFRAHLQGPAGASVAGPQTSEAFAITVRDVN 
ER P PQ PQAS VPLR LTRGS RAP ISRAQL S WD PDS APGE I E YE VQ 
RAPHNGFLSLVGGGLGPVTRFTQADVDSGRLAFVANGSSVAGIF 
QLSMSDGASPPLPMSLAVDILPSAIEVQLRAPLEVPQALGRSSL 
SQQQLRWSDREEPEAAYRLIQGPQYGHLLVGGRPTSAFSQFQI 
DQGEWFAFTNFSSSHDHFRVLAIJUiGVNASAVVNVTVPJ^IW 
WAGGPWPQGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR 
WRVPRARTEPGGSQLVEQFTQQDLEDGRLGIiEVGRPEGRAPGP 
AGDS LTLELWAQGVPPAVAS LDFATE P YNAARP YS VALLS VPEA 
ARTEAG KPESSTPTGEPGPMASSPEPAVAKGG FLS FLEANMFSV 
1 1 PM C L VLLLLAL I LP LL F Y LjRKRNKTG KHD VQ VLT AKP RNGLA 
GDTET FRKVE PGQ AI PLTAVPGQG P P PGGQP D PELLQF CRT PNP 
ALKNGQYWV 


5451 


1 


2274 


RDS S EQGRTGDTLGRPSACMDALKP PCLWRNHE RGKKDRDS CGR 
KNSEPGSPHSLEALRDAAPSQGLNFLLLFTKMLFIFNFLFSPLP 
TPALI CILTFGAAI FLWLITRPQPVLPLLDLNNQSVGIEGGARK 
G VS Q KNNDLTS CC F S DAKTM YEV FQRGLAVS DNGP CLG YR KPNQ 
PYRWLSYKQVSDRAEYLGSCLLHKGYKSSPDQFVGIFAQNRPEW 
IISELACYTYSMVAVPLYDTLGPEAIVHIVNKADIAMVICDTPQ 
KAL VL I GNVEKG FT P S LKV 1 1 LMDP FDDDLKQRGE KSGIEILSL 
YDAENU3KEHFRKPVPPSPEDLSVICFTSGTTGDPKGAMITHQN 
I VSNAAAFLKC VEHAYE PTPDDVAI S YLPLAHMFERI VQAWYS 
CGARVG FFQGDIRLLADDMKTLKPTLF PAVPRLLNRI YDKVQNE 
AKTPLKKFLLKLAVSSKFKELQKGI IRHDSFWDKLI FAKIQDSL 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPliACNYm^EDVADMNYFTVNNEGEVCIKG 
TNVFKGYLKDPEKTQEALDSDGWLHTGDIGRWLPNGTLKIIDRK 
KNI FKLAQGEYIAPEKI ENI YNRSQPVLQ I FVHGESLRSS LVGV 
VVTDTDVLPSFAAKLGVKGSFEELCQNQVVREAILEDLQKIGKE 
SGLKTFEQVKAI FLHPEPFS I ENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1833 


1138 


SRVPSLCLSLSLSLSPSREPVAGAPGCGTAGPPAMATLWGGLLR 
LGSLLSLSCLALSVLLLAQLSDAAKNFEDVRCKCICPPYKENSG 
HIYNKN I SQKDCDCLHWE PMP VRGPDVEAYCLRCECKYEERSS 
VTIKVTI 1 1 YLS I LGLLLLYMVYLTL VE P I LKRRLFGHAQL I QS 
DDD I GDHQP FANAH DVLAR S RSRANVLNKVE YAQQRWKLQ VQE Q 
RKSVFDRHWLS 


5453 


111 


1520 


PS I PAAVPQSAPPE PHREETVTATATS QVAQQPPAAAAPGBQAV 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYL}<RFKVMKIKVL 
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFrTGPTGSVKlGD 
LGLATIiKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EGCIRQNKDERYS IKDLLNHAFFQEETGVRVELAEEDDGEKIAI 
KLWLR I ED I KKLKGKY KDNEAI EFS FDLERNVPEDVAQEMVE SG 
YVCEGDHKTMAKAI KDRVSLI KRKREQRQL * 


5454 


111 


1520 


PS I PAAVPQSAP PE PHREETVTATATS QVAQQ P PAAAAPGEQA V 
AG PAPSTVPSSTS KDRP VSQP S LVGSKEE P P PARSGSGGGSAKE 
P Q E ERS QQQDD I E E LE T KAVGMSNDGRFLKFD I EI ORGS FKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRPKVMKIKVL 
RSWCRQILKGLQFLHTRTPPI IHRDLKCDNI FITGPTGSVKIGD 
LGLATLKRAS FAKSVIGTPEFMAPEMYEEKYDES VDVYAFGMCM 
LE MATS E Y P YS ECQNAAQ I YRRVTSG VKPAS FDKVAI P E VKE 1 1 
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEEDDGEKIAI 
KLWLR I E D I KKLKG KYKDNE AI E FS FDLE RNVP ED VAQEMVES G 
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL* 


5455


1359 


377 


LTM VS PAT RKS LP KVKAMD FITS TAI LPLL FGCLGVFG LFRLLQ 
WVRG KAYLRNAVW I TG ATSGLGKE CAKVF YAAGAKLVLCGRNG 
G ALEELI RELTASHATKVQTHKP YLVTFDLTDSGAIVAAAAE IL 
QCFGYVDILVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT
KALLPSMIKRROGHIVAISSIQGKMSIPFRSAYAASKHATQAFF 
DCLRAEMEQYE I EVTV I S PG Y IHTNLS VNAI TADGSRYGVMDTT 
TAQGRS P VEVAQDVLAAVGKKKKDVI LADLL PS LAVYLRTLAPG 
LF FSLMAS RARKERKS KNS 


5456 


2 


2332 


CGAG L VAAG AVLVLY PAS RAGE RTR V PGS PAPSSLPLHS PGACG 
TE VDMD PQRS PLLEVKGNI E LKR PL I KAP SQLPLSGSRLKRRPD 
QMEIXSLEPEKKRTRGIjGATTKITTSHPRVPSLTTVPQTQGQTTA 
Q KVS KKTG PRCS TAIATGLKNQKP VPAVP VQKS GTSGVP PMAGG 
KKPSJCRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDOLR 
DAQQQ VKAI^TERTTLEGHLAKVQAQAEC^QELKNLRACVLEL 
EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 
ALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRLHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGODEVFEEIA 
MLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGLIPR 
ALRHLFS VAQELSGQGWTYS FVASYVE I YNETVRDLLATGTRKG 
QGGECEIRRAGPGSEELTVTNARYVPVSCEKEVDALLHLARQNR 
AVARTAQNERSSRSHSVFQLQISGEHSSRGLQCGAPLSLVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VPYRNS KLTYLLQNSLGGSAKMLMFVNI SPLEENVSESLNS LRF 
ASKVEPS\^FGTAQ3NRKWKTDPDLCVCVCVCVCVCVCVCVCVP 
MSM YRVRGGRVAGGCFIGWRAPCPRAI K 


5457 


2 


1540 


DDFVERRRWTRTTCLVRS P PH VPVCGHACSWNGGSLDPLKGTPA 
LLRS AERLMRKVKKLRLDKENTGSWRS FSLNS EGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRS I IHGSRKYSGLI VNK 
APHD FQF VQKTDESG PH S HRL YYLGMP YG SRENS LL YSE I P KKV 
RKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGI 
TS YD FHS E SGLFL FQASNS LFHCRDGGKNGFMVS PGPGCVS P MK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGS EGLKT LRI LYE E VDES B VE VTHVP S P ALEE R KTDS YRY PRT 
GSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQA 
ASLCQS CPQECPAVCGVRGGHQRLDQCS 


5458 


6642 


4022 


FVPG LRE PQWEPAQP S ATMSAPSEEEE YARliVMEAQPEWLRAEV 
KRLSHELAETTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEA 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVLELQTELKQLRNVLTNTQSENERLASVAQBLKEINQNVEI 
QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 
VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISERQLEEALET 
LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKFSDDAA 
E PNND AEALVNG FEHGG LAKL P LDNKTS TP KKEGLAP PS PS LVS 
DLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVDINGPEILACKYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRQDRELLARLEKELKKVS 
DVAGETQGSLSVAQDELVTFSEELANLYHHVCMCNNETPNRVML 
DYYREGQGGAGRTSPGGRTS PEARGRRS PILIiPKGLLAPEAGRA 
DGGTGDSSPSPGSSLPSPLSDPRREPMNIYNLIAIIRDQIKHLQ 
AAVDRTTELS RQR I AS QELGPAVDKDKEALMEEILKLKS LLSTK 
REQITTLRTVLKANKQTAEVALANLKSKYENEKAMVTETMMKLR 
NELKALKEDAATFSSLRAMFATRCDEYITQLDENQRQLAAAEDE 
KKTLNS LLRMAI QQ KLALTQRLE LLELDHEQTRRG RAKAAPKTK 
PATPSVSHTCACASDRAEGTGLANQVFCSEKHSIYCD 


5459 


316 


1262 


RGGHRLSGMASNFND IVKQGYVR I RSRRLG I YQRCWLVFKKAS S 
KGPIOILEKFSDERAAYFRCYHICVTELNNVKNVARLPKSTKKHAI 
G I YFNDDTS KTFACES DLEADEW CKVLQMECVGTRIND I S liGE P 
DL1ATG\^REQSERFNVYLMPSPNLGCYMGECALQITYEYICLW 
D VQNP R VKL I S WPLS ALRR YGRD TT WFTFEAGRMCE TGEGLF I F 
QTRDGEAIYQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 
SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPLKLHRTETF 
PAYRSEH 


5460 


45 


2097 


R PGCRAGELSTGS RARERVRNRVSAPCGQDSRRCDPE VLRGRS P 
GLGLAEMPS CGACTCGAAAVRLI TS SLAS AQRGI SGGRIHMSVL 
GRLGTFETQ 1 LQRAPLRS FTETPAY FAS KDG I S KDGSGDGNKKS 
AS EG S S KKSG S GNSG KGGNQLRC P KCGDLCTHVETF VS S T RFVK 
CEKCHHFFVVLSEADSKKSIIKEPESAAEAVKIiAFQQKPPPPPK 
KI YNYLDKYWGOSFAKKVLS VAVYNHYKRI YNNIPANLRQQAE 
VEKQTSLTPRELEIRRREDEYRFTKLLQIAGISPHGKAIjGASMQ 
QQVNQQ I PQEKRGGEVLDSSHDD I KLEKSNIIiLLGPTGSGKTLL 
AQTIiAKCLDVPFAICDCTTLTQAGYVGEDIESVIAKLLQDANYN 
VEKAQQGIVFLDEVDKIGSVPGIHQLRDVGGEGVQQGLLKLLEG 
T I VNVP EKNSRKLRG E T VQVDTTN I LFVAS GAFNGLDR I I SRR K 
NE K YLG FGT PS N LGKGRRAAAAAD LANRSGESNTHQD I E E KDRL 
LRHVEARDLIEFGMIPEFVGRLPWVPLHSLDEKTIjVQILTEPR 
NAVIPQYQALFSMDKCELNVTEDALKAIARIiALERKTGARGLRS 
I MEKLLIiEPMFE VPNS D I VCVEVDKE WEGKKE PGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461 


1481 


160 


INPPPPPKSPCGRARKWRRRRRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFLVSCALPD 
S VLRRFWRTMCAVLGLVARQEDS GLRDHS VRVL I SNHVTPFDH 
NI VNLLTTCST PLLNS P PS FVCWSRGFMEMNGRGELVES LKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
Q VQR PL VS VTVS DASWVS ELLWS L FVF FTVYQVRWLRP VHRQLG 
EANEEFALRVQQLVAKELGQTGTRLTPADKAEHMKRQRHPRLRP 
QSAQSSFPPSPqPSPDVQLATIAQRVKEVLPHVPLGVIQRDLAK 
TGCVDLT I TNLLEG AVAFMP E DI T KGTQ SLPTAS AS KF P S S GP V 
TPQPTALTFAKS S WARQESLQ ERKQAL YEYARRRFTERRAQEAD 


5462 


663 


3353 


KIKERQMSANNSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA 
RLSNGS FS APS LTNSRGSVHTVS FLLQ I GLTRES VTIEAQELS L 
S AVKDL VC S JVYQKF PE CG FFGMYDKI LLFRHDMNSENI LQL I T 
SADE I HEGDLVEWLSALATVEDFQ I RPHTLYVHS YKAPTFCD Y 
OSEMLWGLVRQGLKCEGCGLNYHKRCAFKIPNNCSGVRKRRiSN 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHS YTRPT I CQYCKRLLKGLFRQGMQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSNNIPIJ^WQSIKHTKRKSSTMVKEGWMVHYTSRDNLRK 
RHYWRLDS KCLTL FQNESG S KY Y KEIPLSEILRISS PRDFTNI S 
QGSNPHCFEIITDTMVYFVGENNGDSSHNPVLAATGVGLDVAQS 
WEKAI RQALMPVTPQASVCTSPGQGKDHKDLSTS I S VSNCQI QE 
NVDISTVYQIFADEVI^SGQFGIV^GGiCHRKTGRDVArKVIDKM 
R FPTKQE S Q LRNE VA I LQN LHHPG I VNLECMFET PER VF WME K 
LHGDMIiEMILSSEKSRLPERITKFMVTQILVALRNLHFKNIVHC 
DLKPENVLLAS AE P F PQVKLCDFGFAR I IGEKSFRRS WGT PAY 
LAPEVLRSKGYNRSLDMWS VGVI I YVS LSGTFPFNEDEDINDQI 
QNAAFMYPPNPWREISGEAIDLINNLLQVKMRKRYSVDKSLSHP 
W LQD YQTWLDLRE FETRI G ER Y I THE S DDARWE I HAYTHNL VY P 
KHFIMAPNPDDMEEDP 


5463 


237 


1012 


LLSVTMTTSRCSKLPEVLPDCTS S AAP WKTVEDCGSLVNGQ PQ 
YVMQVS AKDGQLLSTVVRTLATQS PFNDR PMCRI CHEGS SQEDL 
LSPCECTGTLGTIHRSCLEHWLSSSNTSYCELCHFRFAVERKPR 
P L VEWLRN PGP QHE KRTLFGDM VCFL F I TPLAT I S GWLCLRGAV 
DHLHFS S RLEAVGL I ALTVALFT I YLF WTLVS FRYHCRL YNEWR 
RTNQRVILLIPKSVNVPSNQPSLLGLHSVKRNSKETW 


5464 


195 


677 


SPSMNPRKKVDLKLIIVGAIGVGKTSLLHQYVHKTFYEEYQTTL 
GAS ILSKI I IU3DTTLKLQI WDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQS I LENHLTES I KLSPDQSRSRCC 


5465 


5278 


3348 


KGDPREFI RVHREALECDYVS AHLHEW IDLI FGYKQQGPAAVEA 
VNVFHHLFYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEPVGQI VCTDKGIIAVEQNKVLI P PTWNKTFAWGYADLS C 
R LGT YES D KAMTVYECLS EWGQ 1 LCAI C PNP KLV I TGGTSTWC 
VWEMGTS KE KAKTVTLKQALLGHTDTVTCATAS LAYHI IVSGSR 
DRT C 1 1 WDLNKLS FLTQLRGHRAP VS ALC INELTGD I VS CAGT Y 
IHVWS INGNPI VSVNTFTGRSQQI ICCCMSEMNEWDTQNVI VTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DEDSSDSEADEQSISQDPKDTPSQPSSTSHRPRAASCRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNP I EVRNYS RLKPGYRWERQLVFRSKIiTMHTAFDRKDNAHPA 
E VTALG IS KDHSRI L VGDS RGKVFS WS VSTOPGRSAADHWVKIJE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
S S P VR VCQNC Y YNLQHERGSEDG PRNC 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRS VMG IQTS PVLLASLGVGLVTL
LG1AVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDOG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDVVEFRGPSGL 
LTYTGKGHFN1QPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHPPKDWAYSKGFVTADMIREHLPAPGDDVLVLLCGPPPMV 
QLACHPNLDKLGYSQKNR FTY 


5467 


2103 


4 


GEALRVGTRGCRRDLPDPQARIFIQKKDLEEDESVTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTL 
YRDVMLENYSHLISLAGSSISKPDVITLLEQEKEPWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQISTTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKMISYEKLPTHTPHASLICNT 

I QLTRHQ KFHTGE KTFECKECGKAFNL P TQLNRHKN I HTVKKL F 
ECKECGKS FNRS SNLTQHQS IHAGVICPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKECGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVTlHQKIHTGEKPFECRETOKAFSIiLNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHL IQHQKIHSNEKPFVCRECEMAFRYHCQL I EHSR IHTG 
DKP F E CQDCGKAFNRGS SLVQHQS I HTGEKP YE C KE CGKAFRL Y 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQL I 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468 


225 


2976 


SFLl'OliFOSIjAQLENLCKgLXErrDTTTRljQAEKALVKb^iNSPD 
CI^KCQLIjLERGSSSYSQLIjAATCLTKLVSRTNNPLPLEQRIDI 
RNYVLNYLATRP KIiATFVTQALIQLYAR ITKLGWFDCQKDDYVF 

DTTHPLTKITOKIASSFRDSSLFDIFTLSCNLLKQASGKNLNLND 
ES QHGIiLMQLLKLTHNCLNFDFIGTS TDESSDDLCTVQ I PTS WR 
SAFLDSSTLQLSTIGRCEYEKTCALLVQLFDQSAQSYQELLQSA 
S AS PMD I AVQEGRLTWLVY 1 1 GAVIGGRVS FASTDEODAMDGEli 
VCR VLQ LMNL TDS R LAQAGNE KLEIjAMLS FFEQFRK I YIGDQVQ 
KS S KL YRRLS EVLGLNDETMVLS VF I G K 1 1 TNLJCYWG RCE PITS 
KTLQL LNDLS IG Y S S WKL VKLS AVQ FMLNNHTSEH FS FLG INN 
QSNLTDMRCRTTFYTAIjGRLIjMVDLGEDEDQYEQFMLPLTAAFE 
AVAQMF STNS FNEQEAKRTLVGLVRDLRG IAFAFNAKTS FNMLF 
EW I YPS YMP I LQRAI ELWYHDPACTT P VLKLMAELVHNRSQRLQ 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FDVSS PNG I LLFRETS KMITMVGNRlLTLGE VPKDQVYALKLKG 
ISICFSKLKAALSGSYVNFGVFRLYGDDALDNALQTFIKLLLSI 
PHSDLLDYPKLSQSYYSLLEVLTQDHMNFIASliEPHVIMYILSS 
ISEGLTALDTMVCTGCCSCLDHIVrYLFKQLSRSTKKRTTPLNQ 
ESDRFLHIMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPLjLG 
L I LLNEKY FSDLRNS I VNS QP PE KQQAMHLC FENLMEG I ERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLLSEAHTC 
VPENNGGAGCVCHLLMDDWSADNYTLDLWAGQQLLWKGSFKPS 
EHVKPRAPGNLTVHTNVSDTLLLTWSNPYPPDNYLYNHLTYAVN 
I WS ENDPAD FR I YNVTYLE PSLR I AASTLKSG I S YRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVS I TK I KKEWWDQI PNPARS RLVAI I IQDAQGSQWE KRS RGQ 
E PAKCPHWKNCLTKLLPCFLEHNMICRDEDPHKAAKEMPFQGSGK 
SAWCPVE ISKTVLWPES I SWRCVELFEAPVECEEEEEVEE EKG 
SFCASPESSRDDFQEGREGIVARLTE3LFLDLLGEENGGFCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS 
PPASPTQSPDNLTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLLARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRRNV 
LQHGAAAAP VS A P TSGYQE FVHAVEQGGTQASA WG LGPPG EAG 
YKAFSSLIASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEQATDPLVDSLGSGXVYSALTCHLCGHLKQCHGOEDGG 
QTPVMASPCCGCCCGDRAS PPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACR I RTSLNRG I AAVKEDA VEMLAS YGLAYS LMKFFTG PMS DF 
KNVGLV FVNS KRDRTKAVL CMWAG AI AAVFHTL I AYSDLG Y Y I 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
SFL VGCAS I £ D V I AQ WFVAI LLHS HLECRE PIiL I P I LS L YMG A 
LVRCTTLCLGYYKNIHDI I PDRSGPELGGDATIRKMLSFWWPLA 
LILATQR ISRP I VNLFVSRDLGGSS AATEAVAI LTAT YP VGHM P 
YGWLTE I RAVYPAFDKNNPSNKLVS TSNTVTAAH I KKFT FVCMA 
LSLTLC FVM FVf T PNVS EKI L I D 1 1 G VDFAFAELCWP L»R I FS FF 
PVP VTVRAHLTGWLMTLKKTFVLAPSS VLRI I VL IASLWLPYL 
G VHGATLGVG S LLAG FVGES TMDA I AACYVYRKQ KKKMENE S AT 
EGEDS AMTDMP P TEEVTD I VEMREENE 


5471 


1868 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAIKKISPFEHQTYCQRTLREIQILLRFRHENVIGIRDILR 
ASTLBAMRDVY I VQDLMETDLYKLLKSQQLSNDH I C YFLYQ ILR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
n iv»r L 1 h x v A I KW i KAFfc, 1 MbNS KG YTKS I DI WS VGC I LAEMLS 
NR P I F PG KKYLDQhNtilLiG I LG S PSQEDLNCI INMKARNYLQ S L 
P S KT KVAWAKL FPKS DS KALDLLDRMLT FNPNKR I TVE E ALAH P 
YLEQYYDPTDEP VAEE PFTFAMELDDLP KERLKEL I FQETARFQ 
PGVLEAP 


5472 


1469 


753 


LYVMAR YLSDEE VAVS I DRLCKANGRS PS I PFGTVR I PGRAR VR 
D PQ ALW I FG YGS L VW R PDFAYS D S RVG F VRG YS RRFWQGDT FHR 

VLGGYDTKEVTFYPQDAPDQPLKALAYVATPQNPGYLGPAPEEA 
I ATQ I LACRG FSGHNLE YLLRVRDVMQ L CG PQ AQD EHLAA I VDA 
VGTMLPCFCPTEQALALV 


5473


3 


2119 


FMNVKLLI QDLEDI EQRVPVMDAQYKI I TKTAHLI TKES PQEEG 
KEMFATMS KL KEQL TKVKECY S PLLYESQQLLI PLBELEKQMTS 
F YDS LGK I N E 1 1 T VLE REAQS SAL FKQKHQELLACQENC KKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQSMVKK 
TGDWFOXHVETNSRLMKKFEESRAELEKVLR IAQEGLBEKGDPEE 
LLRRHTEFFSQLDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 
KLHKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTELDRET 
KLMPQEGSEKITKE^RVP^DKGPHHLCEKRLQLIEELCVKLPV 
RDPVRDTPGTCHVTLKELRAAIDSTYRKIiMEDPDKWiCDYTSRFS 
E FS S WI STNETQLKG I KGEAIDTANHGEVKRAVEEIRNGVTKRG 
ETLSWLKSRLKVTiTEVSSENEAQKC^DEIAKLSSSFKALVTLLS 
EVEKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQABKILDTE 
NL F E AQQUjLHHQQKTKR I S AKKRD VQQQ I AQAQQGEGGLPDRG 
HEELRKLESTLDGLERSRERQERRIQVTLRKWERFETNKETWR 
YL FQTGSS H ER FLS FS SLESLSSE LEQTKE FS KRTES I AVQAEN 
LVKEASEIPLGPQNKQLLQQQAKSIKEQVKKLEDTIjEEEYVIDK 

s 


5474 


2 


780


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKSGWLLRQSTI 
l^WKraWFDLWSDGKLIYYDDGTRQNIEDKVHMPMDCINIRTG 
QE CRDTQPPDG KS KDCMLQ I VCRDG KT I SLCAES TDDCLAWKFT 
LQDS RTNTAYVGS AVMTDETSWS S PP P YTAYAAP APEVGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDLALGMLAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSSPCIiLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNI SIAVRKI ALLLKPDKE I EHQGNHMTVRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEV PNRG WRHWLEGEML Y LE LT AR DAVCEQVFR KVR 


5476 


192 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRAS E VLCSTNVSHYELQVE IGRGFDNLTS VHIiARHTPTGTL 
VTI KITNLENCNEERLKALQKAVILSHF FRHPNITTYWTVFTVG 
S WLWV I S PFMAYGS ASQLLRT YFPEGMS ETLIRNI LFGAVRGLN 
YLHQNGC IHRSI KASHIL I SGDGLVTLS GLSHLHS LVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSGIGESVLVSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 
LQQDPEKRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNXPSI 
SLPPVLPWTEPECDFPDEKDSYWBF 


5477 


3 


1044 


RGNSRLR YSHEDE LQLPRLP ELFETGRQLLDE VE VATE P AGS R I 
VQEKVFKGLDLLEKAAEMLSQLDLFSRNEDLEE IAS TDLKYLLV 
P AFQG ALTMKQVN P S KRLDHLQRARE H F INYLTQ CHCYHVAE FE 
LPKTMNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLSAMKSAVESGQADDERVREYYLLHLQRWIDI SLEE IES IDQ 
E I KI LRERD SSRE AS TS NS SRQERP P VKP F I LTRNMAQ AKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDOGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVR I W VP NVKG ES TVFRAHTATVRS VH FCSDGQS FVTAS DD KT 
VKVWATHR Q KFLFS LS QH INW VR CAKFS PDGRLI VSASDDKTVK 
LWDKS S RE CVHS YCEHGG FVT YVD FHP SGTC IAAAGMDNTVKVW 
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
D I GDHGEVTKVPRPPATLAS SMGNLTVS I LEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5479 


2 


835 


KTVRIV^PNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT 
VKVWATHRQ KFLFS LSQHINWVRCAKFS P DGRLI VSASDDKTVK 
L WDKSSRE C VHS YCEH GG FVT YVDFHPS GTC IAAAGMDNTVKVW 
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSK x Gfc. i r AbWjb JJ£*<J VM v ti K-oin r 
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5480 


444 


1952 


LSLTSRMEEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALRE KWLLDG I S SGKEQEEMKKQNQQDQHQI QVLEQS I 
LRLEKEIQDLEKAELQISTKERAILKKLKS IERTTEDI IRSVKV 
EREERAEESIEDIYANIPDLPKSYIPSRLRKEINEEKEDDEQNR 
KALYAME I KVEKDLKTGESTVLS S I PLPSDDFKGTGI KVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKSPTEYH 
EPVYANPFYRPTTPQRETVTPGPNFQERIKIKTNGLGIGVNESI 
HNMGNGLSEERGNNFNHISPIPPVPHPRSVIQQAEEKLHTPQKR 
LMTPWE E S NVMQDKDAP S P KPRL S PRET I FG KS EHQNS S PTCQ E 
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
G YDGI I HAE LWI DDE EEEDEGEAEKPS YHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHKS 


5481 


3 


1422 


NS PGSVCLCQCVCPSLLHCLPPLLLLLLLPLjLLHES PQPPALRV
VATS SDRNFMNKHQKP VLTGQRFKTRKRDEKEKFE PTVFRDTLV 
QGLNEAGDDLEAVAKFLDSTGSRLDYRRYADTLFDILVAGSMLA 
PGG TR I DDGDKT KMTNHCVFSANEDHETI RN YAQ VFNKL IRR YK 
YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSQIIiLGNGTLPAT 
I LTS LFTDS LVKEG I AAS F AVKL F KAWMAE KD ANS VTS S LRKAN 
IiDKRLLELFPVNRQSVDHFAKYFTDAGLKEIiSDFLRVQQSLGTR 
KELQKELQERLSQECPIKEVVLYVKEEMKRNDIiPETAVIGLLWT 
CIMNAVEl^KKEELVAEOALiKHIjlcnYfiPT.T,a\7T?QC!OfjnQT?T ttt 

QKVQEYCYDNIHFMKAFQKIVVLFYKADVLSEEAILKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRL 
EGLQEKDSGPYSCSVNVQDKQGKSRGHSIKTLELNVLVPPAPPS 
CRLQGVPHVGANVTLSCOSPRS KPAVQYQWDRQLP S FQTFFAPA 

GAAWAGAWGTLVGLGLLAGLVLLYHRRGKALEEPAND I KEDA 
IAPRTLPWPKSSDTISKNGTLSSVTSARALRPPHGPPRPGALTP 
TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVP AQSQAGS LV 


5483 


1 


788 


FFFFKGCRAGRGNESDYRKIjEEMHQRFLVSERSKDDLQLRLTRA 
ENRIKQLETDSSEEISRYQEMIQKLQNVLBSERENCGLVSEQRL 
KLQQENKQLRKETESLRKIALEAQKKAFCVKISTMEHEFSIKERG 
FEVQLREMEDSNRNSIVEIjRHLLATQQKAANRWKEETKKLTESA 

e i r innlkselsrqklhtqellsqlemanekvaenekl i lehqe 
kanrlqrrlsqaeeraasasqqlsvitvqrrkaaslmnleni 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 
LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 
SDNDDEKONSDDEEOPOLSDEEKMONSDDTi'RPOAQnPPWTJUQnn 
EEEQDHKSESARGSDS EDEVLRMKR KNA IASDSEADSDTE VPKD 
NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
P IPETRIEVEI PKVNTDLGNDLYFVKLPNFLSVEPRPFDPQYYE 
DE FEDEEMLDEEGRTRLXLKVENT IRWR IRRDEEGNEI KESNAR 
IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 
VFKTKLTFRPHSTDSATHRKMTLSIADRCSKTQKIRILPMAGRD 
PECQRTEMIKKEEERLRAS I RRESQQRRMREKQHQRGLSAS YLE 
PDRYDEEEEGEES I SLAAI KNRYKGGIREERARI YSSDSDEGSE 
EDKAQRIxLKAKiaTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 
AGTN 


5485 


161 


1074 


KRKILSSMMDSEAHEKRPPILTSSKQDISPHITNVGEMKHYLCG 
CCAAFNNVAI T FP I QKVL FRQQL YG I KTRDAI LQLRRDG FRNL Y 
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTTEAI FTP LERVQTLU3DHKHHDKFTNTYQAFKALKCHGI 
GEYYRGLVPILFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFIjFFPINWKTRIQSQIGGEFQS FPKVFQKI 
WLERDRKLINLFRGAHLNYHRSLISWGI I NATYE FLLKV I 


5486 


1404 


142 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERSPR
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GS LATS I S QMVKTEG KGAKR KTS EEE KNGS EELVE KKVCKAS S V 
IFGLKGYVAERKGEREEMQDAHVILNDITEECRPPSSLITRVSY 
FAVFDGHGGIRASKFAAQNLHQNLIRKFPKGDVISVEKTVKRCL 
LDTFKHTDEEFLKQASSQKPAWKDGSTATCVLAVDNrLYIANLG 
DSRAILCRYNEESQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 
DGRVLG VLEVSRS IGDGQ YKRCGVTS VPD I RRCQLTPNDRF I LL 
ACDGLFKVFTPEEAVNFILSCLEDEKIQTREGKSAADARYEAAC 
NRTiANKAVORG^ AIWVTVMVVR TfJH 


5487 


535 


182 


AVSLEQIRGLQTPAPVPLPLQPCPSNCDMERVTLALLIiLAGLTA 
LEANDPFANKDDPFYYDWKNLQLSGLI CGGLLAI AG IAAVLSG K 
CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488


1072 


259 


AMAASGEPQRQWQEEVAAWWGSCMTDLVSLTSRLPKTGETIH 
GHKF F I GFGGKGANQ C VQAARLGAMTSMVCKVGKDS FGND YI EN 
LKQNDISTEFTYQTKDAATGTASI IVNNEGQNI I VI VAGANLLL 
NTED LRAAANV I S RAKVM V CQLE I T PATS LE ALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAEILTGLTVGSAADAGE 
AALVLLKRGCQWI ITLGAEGCWLSQTEPE PKHI PTEKYKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK 
LERDCRSPVEPWAAAS PDLALACLCHCQDLS SGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS I AIRKKQQEWGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQI FNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKI FLGQWREEPKMPLLLLGETEPLK 
LERDCRSPVEPWAAAS PDLALACLCHCQDLS SGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS I AI RXKQQE WGFLEANKIDFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF 
FS AKE EN 1 1 YS F LG LAP P PDS KGSE KAE EGG ETE AQ KEGS ED VG 
NLPEAQEKNEEEGETATEETEE IAMEGAEGEAEE EEETAEGEEP 
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE 
QGDMASSFLPAGAITGDSGGELSSGDDSGEVEFPHSPEIEETSC 
LAE LF EKAAAHLQGL I QVAS RE Q LL YL Y ARYKQ VKVGNCNT P K P 
SFFDFEGKQKWEAWKALGDSSPSQAMQEYIAWKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNIFDYCRENNIDH 
ITKAI KS KNVDVNVTCDEEGRALLHWACDRGHKELVTVLLQHRAD 
I NCQDNEGQ TALK YAS ACE FLD I VEL LLQS GAD PTLRDQDGCL P 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


AS KN PLS AVCTTG I MS SLAVRDPAMDRS LRS VFVGN I P YEATEE 
QLKDI FSEVGSWS FRLVYDRETGKPKG YGFCEYQDQETALSAM 
RNLNGRE FS GRALRVDNAAS EKNKEEL KS LG PAAP 1 1 DSPYGDP 
IDPEDAPESITRAVASLPPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQLAYALLQAQWMRIMDPE IALKILHRKIHVTPL I PGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPQHLARRPVKDI 
P PLMQTP I QGGI PAPGP I PAAVPGAGPGSLTPGGAMQ PQLGMPG 
VG P VPLERGQVQMSDPRAP I PRG P VTPGGLP PRGLLGDAPNDPR 
GGTLIjS VTGEVEPRGYLGPPHQGPPMHHASGHDTRGPS SHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLErRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGG IQG PGP INIGAGGPPQGPRQ VPGI SGVGNPGAGMQGTG I Q 
GTGMQGAG I QGGGMQG AG I QGVS I QGGG I QGGG IQGAS KQGG S Q 
PSSFSPGQSQVTPQDQEKAALIMQVLQLTADQIAMLPPEQRQSI 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPEE PRKPGRT iTOAT ,N «5 PT .TWEHVW TCVPOnT PnPT. 
TDTFRVKRPHLRRSASNGH VPGTPVYREKEDMYDEI I ELKKSLH 
VQKS D VDLMRTKLRRLEE ENSRKDRQ I EQLLDP S RGTD FVRTLA 
EKRPDASWINGLKQRILKLEQQCKEKDGTISKIjQTDMKTTNLE 
EMRIAMETYYEEVHRLQTLLASSETTGKKPIjGEKKTGAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 
QAKADLEKELECAREGEEERREREEVLREEIQTLTSKLQELQEM 
KKEEKSDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 

rs p csdgrrdaaarvlqaqwkvykhkkkkavldeaavvlqaa.fr 
GHLTRTKLLASKAHGSEPPSVPGLPDQSSPVPRVPSPIAQAtGS
PVQEEAI VI IQSALRAHLARARHSATGKRTTTAASTRRRS ASAT 
HGDASSPPFIAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


536 


RS KAKI G T PTREVPSTDMKVRRE S SS SLTHR PAPS P AT PRLLGT
RRVLLGVS EGTGCADAMELVLVFLCS LLAPMVLASAAE KEKEMD 
PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSFNQKPRAP 
GD E EAQ VENL I TANATE PQKAEN 


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAFERFCQVNTGPLPLLGQSEPEKWMLPP 
QG AI S ETRMGHPQFWK YE FGA CTGS LASLEQ YS EQL KDMVAFFL 
G CS FS LEEALE KAGLPRRD PAGHS QAGAYKTTVPCVTHAG FCCP 

TjWTMRPTPKTivt ,t?r2T .\ro arrci /irrnr!r\mftiMripir«?T t /-< t t^t** 
jjv v ii'mr aitiujjvjjci^JjVKAuLoJjUnjEUGQ 

S KP A YGDAMVCPPGEVP VFWPS PLTSLGA VSS CETPLAFAS I PG 
CTVMTDLKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLQALEKEVAIIVDQRAWN 
LHQKI VEDAVEQGVLKTQ I P I LTYQGGS VEAAQAFLCKNGDPQT 
PRFDHLVAIEPAGRAADGNYYNARKMNIKHLVDPIDDLFLAAKK 
I PG I S STGVGDGGNELGMGKVKEAVRRHIRHGDVI ACDVEADFA 
VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 
WTQALPSVIKEEKXLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMI QKL VDVTTAQV 


5496 


3 


2408 


QD T KMHE I YKGN I T PQLN KNTLKTS AATDVWAVY FSQFW I D YEG 
MKSGKGRPISFVDSFPLSIWICQPTRYAESQKEPQTCNQVSLNT 
SQSESSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLFLHESLILLSE 
NLRKDVEAVTGS PASQTS I C I G I LLRS AELALLLHPVDQANTLK 
SPVSES VS P WPD YLPTENGDFLSS KR KQ ISRDINRI RS VTVNH 
MSDNRSMSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRE 

-ajar l-> j L J_o o i i_i j. o ISXjv* ti J. J. Aol fr ft AKi/I W iPPA/\.s n 

SENLDISKEETPPVRTLKSQSSLSGKPKERCPPNLAPLCVSYKN 
MKRSSSQMSLDTISLDSM1LEEQLLBSDGSDSHMFLEKGNKKNS 
TTNYRGTAES VNAGANLQNYGETS PDAI S TNSEGAQENHDDLMS 
VWFKI TGVNGE I D I RGEDTE I CLQVNQVTPDQLGN I S LRHYLC 
NRP VGS DQ KAV IHSKSSPEISLR FESG PGAVIHS LLAE KNG FLQ 
CHIKNFSTEFLTSSLMNIQHFLEDETVATVMPWKIQVSNTKINL 
KDDS PRS S TVS LE P AP VTVH I DHLWERS DDGS FH I RDSHMLNT 
GNDLKENVKS DSVLLTSGKYDLKKQRS VTQATQTS PGVP WPSQS 
ANFPEFSFDFTREQLMEENBSLKQEIiAKAKMAIAEAHLEKDALIi 
HHIKKMTVE 


5497 


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSLASQ1IREQQSPNV
CFIYTCYSGFPSI^COCHFVSPHSSr^TMPPQPDDDPwr"pnT cm 

GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSWKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
S TAGLPTTLGPAMVTPGVATIRRTPSTKPS VRRGTIGAGP IP IK 
TPVI PVKTPTVPDLPGVLPAPPDGPEERGEHSPES PS VGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEHRQAIPESEAEDQER 
EPPSATVSPGQIPESDPADLSPRDrPQGEDMtiNAIRRGVKLKKT 
TTNDRSAPRFS 


5498 


2434 


1492 


I LTHQEIFTGEKPCECGKAS IQMSHLSQQKI YSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKPFECNECGKAFSQKQYVIKHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 
P FKCS ECGTAFGQKKYL I KHQNI HTG EKP YE CNECGKAFS QRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NECGKAFSQFSTLALHLRI HTGKKP YOCS ECGKAFSQKSHHI RH 
QKIHTH 


5499 


324 


926 


GFGQIGRGHKITTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF 
FVAVPVAVTFLDRVACVARVEGASMQPSLNPGGSQSSDVVLLiNH 
WKVRNFEVHRGDI VSLVS P KNPEQKI I KRVIALEGDIVRTIGHK 
NRYVKVPRGH1WVEGDHHGHSFDSNSFGPVSLGLLHAHATHILW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1286 


KPD WRLQNLP PRL Y LWRS S R FGFG HLKKRLQMD F K I EHT W DGF P 
VKHEPVFIRLNPGDRGVMMDISAPFFRDPPAPLGEPGKPFNELW 
DYEWE AFFLND I TEQ YLE VELC PHGQHLVLLLS GRRNVWKQEL 
PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 
YEALYPVPQHBLQQGQKPDFHCLEYFKSFNFNTLLGEEWKQPES 
DLWLIEKCDI 


5501 


2927 


2226 


CRP P VSARVAPGHQGAVGGSGRRPARVE WDAAARP SSRP FSLP 
AA1MLAL I SRLLDWFRSLFWKEEMELTL VGLQYS GKTTFVNVIA 
SGQFSEDMIPTVGFNMRKVTKGNVTIKIWDIGGQPRFRSMWERY 
CRGVNAIVYMIDAADREK1EASRNELHNLLDKPQLQGIPVLVLG 
NKRDLPNALDEKQL I EKMNLSAIQDRE I CCYS IS CKEKDN I DI T 
LQWL I QHS KSRRS 


5502 


3 


824 


NSAFP VWVPERTALLTCPLGAAPGS S REAPG IAG P PNSTAMSKL 
GKFFKGGGSSKSRAAPSPQEALVRLRETEEMLGKKQEYLENRIQ 
REIALAKKHGTQNKRAALQALKRKKRFEKQLTQIDGTIjSTIEFQ 
R EAL ENSHTNTEVLRNMG FAAKAMKS VHENMDLNK I DDLMQ E I T 
EQQD I AQE I S EAFS QR VG FGDDFDE DELMAE IiE E LEQEE lnkkm 
TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 
I KQLAAWAT 


5503 


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAIKSTA 
EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AVYTQ YFDI S Y I PSTVFFFNGQHMKVDYGGEDPALRS IKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563


QLSFSFQAPVTFDDITVYLLQEEWVLLSQQQKELCGSNKLVAPL 
GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYMG 
EMEVQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSIRDKRSRL 
I EGYTGPFKVETLKYHAKS KAHMFCVNALAARDP I WAARFRS IR 
DPPGDVLAS PEPLFTADCP I FYPPGPLGGFDSMAELLPSSRAEL 
EDPGGDGA I PAMYLDCISDLRQKEI TDG IHS SSD INI LYNDA VE 
SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 
QKEL YRD VMRMNYE LLASLGP AAAKP DL I S KLERRAAPW I KD PN 
GPKWGKGRPPGNKKMVAVREADTQASAADSALLPGSPVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCSACIERPNLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 
NTVE I KEDTPHTALVPE ISSDLMANMEHFFNAAYS I AYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 
VRNSPCVSVLLDSSTDASEQACVGIYIRYFKQMEVKESYITLAP 
LYSETADGYFETIVSALDELDIPFRKPGWWGLGTDGSAMLSCR 
GGLVE KFQEVI PQLLPVHCVAHRLHLAWDACGS I DLVKKCDRH 
IRTVFKFYQSSNKRLNELQEGAAPIiEQEI I RL KD LNAVR WAS R 
RRTLHALLVSWPALARHLQRVAEAGGQIGHRAKGMLKLMRGFHF 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQAGPKEEEFNASFKDGRLHGI CLDKLEVAEQRFQADRERTV 
LTGIEYLQQRFDADRPPQLKNMEVFDTMAWPSGIEIiASFGNDDI 
LNLARYFECSLPTGYSEEALLEEWLGLKTIAQHLPFSMLCKNAL 
AQHCRFPLLS KL.MAVWCVP I STS CCERGFKAMNR IRTDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQVPARSPASARLRKEEMGALYVEEPRTQKPPILPSREAAEVL 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCS PRS LSAAXMSNRNNNKLPSNLPQLQNL I KRDPPAY I EEFLQ 
QYNH YKSNVE I FKLQPNKPS KELAEL VMFMAQ ISHC YPE YLSNF 
P^EVKDLLSCMTV^DPDLRMTFCKALILIjRNKNLINPSSLLEL 
FFELFRCHDKLLRKTLYTHIVTDIKNINAKHKITNK^NVVLQNFM 
YtMLfeDSNATAAKMSLDVMIELYRRNIWNDAKTVNVITTACPSK 
VTK I L VAAL TF FLG KDED E KQDSD S ESEDDG PTARDIiLVQYATG 
KKSS KNKKKLEKAMKVLKKHRKKKXPEVFNFS AIHLIHDPQDFA 
EKLLKQLECCKERFE\nCMMLM^ISRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKIUjFAAQASHHLVPPEIIQSIjLMTVANNFVTDK 
NSGE VMTVG INAI KB I TARCPIAMTEELLQDLAQYKTHKDKNVM 
MSARTL IHL FRTLNPQMLQKKFRGKPTEAS I EARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSEEEDADGEWIDVQH 
s SDEEQQE I S KKLNSMPMEERKAKAAAI STSRVLTQEDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERLHKK 
PKS DKETRLATAMAGKTDRKEFVRKKTKTN PFS S STNKEKKKQK 
N FMMMR YSQNVRS KNKRS FRJSKQLAIjRDALLKKKKRMK 


5506 


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAVW 
LAAFPS LGAGGETPEAP P ES WTQLWFFRPWNAAGYAS FMVPGY 
LL VQY FRRKNYLETGRGLC FPLVKAC VFGNE P KASD E VPLAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TS PGE R FTD S Q FL VLMNR VIiAL I VAGLS CVLCKQPRHGAPMYRY 
S FAS LSNVLS S WCQYEALKFVS FPTQVLAKAS KVI PVMLMGKLV 
SRRSYEHWEYI/TATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAGYIAFDSFTSNWQDALFAYKMSSVQMMFGVNFFSCLFTVGSL 
LE QG ALLEG TR FMGRHS E FAAHALLLS I CS ACG QLFI FYT I GQF 
GAAVFT I IMTLRQAFAI LLS CLLYGHT VTWGGLGVAWFAALiL 
LRVYARGRLKQRGKKAVPVES PVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLE I GGFGTAAG KK
VA VAD VQ FG PMRFHQDQLQ VLLVFTKEDNQ CNG FCRACE KAG FK 
CTVTKEAQAVLACFLDKHHD I I I I DHRNPRQLDAEALCRS I RSS 
KLS ENTVI VGWRRVDREELS VMPF I SAGFTRRYVENPNIMACY 
NELLQLEFGEVRS QLKLRACNSVFTALENS EDAI EITSEDRFIQ 
YANPAFETTMGYQSGELIGKELGEVPINEKKADLLDTINSCIRI 
GKEWQG I YYAKKKNGDN I QQNVKI I P VI GQGGK I RHYVS 1 1 RVC 
NGNN KAE K IS E CVQS DTHT DNQTGKH KDRRKG S LD VKAVASRAT 
EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA 
LDRVLE ILRTTEL YS PQFGAXDDDPHANDIiVGGLMSDGLRRLSG 
NE Y VLS TKNTQ MVS SNIITPIS LDD VP PRI ARAM ENEEYWD FD I 
FELEAATHNRPLIYLGLKMFARFGICEFLHCSESTLRSWLQIIE 
ANYHS SNP YHNSTHS ADVLHATAYFLS KER I KETLDPI DE VAAL 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNI FKNMERND YRTLRQG I IDMVLATEMTKHFEHVN 
KF VNS I N KPLATLE ENGETDKNQE V I NTMLRTP ENRTLI KRMLI 
KCADVSNPCRPIiQYCIEWAARISEEYFSQTDEEKQQGLPWMPV 

Yr* r> nuT'r^ C T DVCftT C PTFIVTrTTTlMTrnBliinR ClAnr DTIT MHUT FtMTklC 

FDRNTOtiiriwU-ii'f IJJXr J. iiJrlr UAWUAT WULicULim^tiulJaNc 
KYWKGX.DEMKLRNLRPPPE 


5508 


1151 


691 


"lSSvfS rrs aS MfaVgcSmg p FLHYWYLS LDRLF PASGLRG FPN 
VIiKKVLVDQLVAS PLLGVWYFLGIjGCIjEGOTVGESCQELREKFW 
EFYKADWCVWPAAQF\/NFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509


1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLKQVDFLNWE 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
t omrr t vt TJMai^uT.nazvuttVVprw^wvrRVftPn\A/ r r , riP2i'PT 1 vTP c !M 

Xji? 1 V 1-jLj JVLji<.> v 1MI w J rl lj^ >\>\ V/\ r v tl» y^jri vj\ • urUV v i urnruv i XvO 1 1 

EDFVTWVDSS KI KRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHLSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRSKM 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 
AD Y I G F I LT LNEGVKG KKLT FE YRVS EAI E KLVALLNTLDRW ID 
ETPPVDQPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKESVGNSTRIDYGTGHEAAFAAFLCCLCKIGVIjRVDDQ 
IAIVTKVFNRYIiEVMRKIjQKTYRMEPAGSQGVWGLDDFQFLPFI 
WGSSQLI DHPYLE PRHFVDE KAVNENHKD YMFLE CI LF I TEMKT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511 


276 


1980 


KLSRVLNLP PENLI TS IS AVP I SQKE EVADFQLS VDS LLEKDND 
HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINRELLTK 

HLR ST 1 1 GKFI ANLKE ALGHQ V I RI NYLGDWGMQFGLLGTG FQL 
FGYEEKLQSNPLQHLFEVYVQVNKEAADDKSVAKAAQEFFQRLE 
LGDVQALS LWQKFRDLS I EE Y I RVYKRLGVYFDE YSGES F YREK 
SQEVLKLhESKGLLLKTI KGTAWDLSGNGDPSS ICTVMRSDGT 
S LY ATRDLAAAI DRMD KYN FDTM I YVTD KGQKKH FQQ VFQM LKI 
MG YDWAERCQHVP FGWQGN KTRRGDVTFLEDVLNE IQLRMLQN 

MA C T V r y r VY p T , ITfJ DnPTfi CD VP 1 T jxr T TAHCI/p t t t onvwn<-ir.iriTi 

i uwim i. r^KiUt\jNF i>\r*KvijijA>vbl X\lur 1\1jLiJjL»oDYKFSWDR 
VFQSRGDTGVFLQYTHARLHS LE ET FGCG YLNDFNTACLQE PQS 
VSILQHLLRFDEVLYKSSQDFOPRHIVSYLLTLSHLAAVAHKTL 
QI KDSPPEVAGARLHLFKAVRSVLANGMKLLG I TPVCRM 


5512 


120 


1015 


DPS LLLT I TVTG VTVL VL VLKS MN S RRRE P I TLQD PEAKY PL PL 
IEKEKISHNTRRFRFGLPSPDHVLGLPVGNYVQLLAKIDNELW 
RAYTPVSSDDDRGFVDL 1 1 KI YFKNVHPQYPEGGKMTQ YLENMK 

TflETTP'FI?nP'Pf2RT\FVT-Tf2DrtNrT.rtT'D D(V>Tepnvv»PT tvt^ttt r>iut t » 
x\scti xrrRUrfloKijr I rlU r\i£\ IAj X K fLXJ 1 btr K.K 1 LiADHIjGM J. A 

GGTGITPMLQIjIRHITKDPSDRTRMSLIFANQTEEDILVRKELE 
EIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLILVCGPPPLIQTAAHPNLEKLGYTQDMI FTY 


5513 


2 


837 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGKTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 

TQPLVALVGNKI DLEHMRT I KPEKHLRFCQENGFSSHFVS AKTG 
DS VFLCFQKVAAE ILG I KLNKAE I EQSQR WKAD IVNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPS W IMGN FRGHALPGTFFFI IGLWWCTKS I LKYI CKKQKRT 
CYLGS KTLFYRLEILEG I TI VGMALTGMAGEQF I PGG PHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADI LCFTI SSLPVS LTKL 
MLSNALFVEAF I FYNHTHGREMLDI FVHQLLVLWFIiTGLVAFL 
EFLVRNNVLLELLRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM 
DHENI LFLTI CFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCS S 
a v v> u jj i\iH/\iL ivc* y Ci o jii c< is pi 


5515 


1572 


260 


FVRLVGRGDCDPLLSVCLTTMPLYEGLGSGGEKTAWIDLGEAF 
TKCGFAGETGPRCIIPSVIKRAGMPKPVRWQYNINTEELYSYL 
KEFIHI L YFRHLLVNPRDRRWT IBS VLCPSHFRETLTRVLFKY 
FEVPSVLLAPSHLMALLTLGINSAMVLDCGYRESLVDPIYEGIP 
Vl^CWGALPLGGKALHKELETQIiLEQCTVDTSVAKEQSLPSVMG 

NVD YP LDGEK I LH I LGS I RDS WEI LFE QDNEEQSVATL I LDSL 
I QC P I DTRKQLAENL W I GGT S MLPGFLHRLLAE I R YLVE KP KY 
KKALGTKTFRIHTPPAKANCVAWLGGAIFGALQDILGSRSVSKE 
YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREP PQAGPGPS PRKS PTASSFDFPffRPIiASSFWMGAQGAQES 
I KAMWRVPGTTRRPVTGESPGMHRPEAMIiLLLTIxALIiGGPTWAG 
KMYGPGGGKYFSTTEDYDHEITGLRVSVGLLLVKSVQVfCLGDSW 
D VKLG ALGGNTQE VTLQPGE Y I TKVF VAFQAFLRGMVM YTS KDR 
Y FYFGKLDGQISS AYPSQEGQVLVGI YGQYQLLG I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR 


5517 


246 


499 


SE I YVAMRTDSS KMTDVESG VANFASSARAGRRNALPD IQSSAA
TDGTSDLPLKLEALSVKEDAKEKDEKTTQDQLEKPQNEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLLLPLVAALDFNYHRQEGMEA 
FLKTVAQNYS SVTHLHS I GKS VKGRNLWVLWGRFPKEHRIG I P 
EFKWA1WHGDETVGRELLLHLIDYLVTSDGKDPEITNLINSTR 
I H I MPSMNPDGFEAVKKPDCYYS IGRENYNQYDLNRNFPDAFE Y 
NNV S RQPETVAVMKWLKTETFVLSANLHGGALVAS YPFDNGVQA 
TGALYSRSLTPDDDVFQYIjAHTYASRNPNMKKGDECKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKASL IEYI KQVHIjGVKGQVFDQNGNPLPNVI VEVQDRK 
HICPYRTNKYGEYYLLLLPGSYIINVTVPGHDPHITKVI I PEKS 
QNFSALKKDILLPFQGQLDSIPVSNPSCPMIPLYRNLPDHSAAT 


5519 


87 


477 


I KS KLNQQVEVQE S E W RLTEAKG P TMGKE SGWDSG RAAVAAWG 
GVVAVGTVLVALS AMGFTS VG IAASS IAAKMMSTAAI ANGGGVA 


5520 


117 


943 


P TEGRQKVLKTFTVP R S ALAMT KTS TC I YHFL VLS W YT FLNY Y I 
S QEGKDEVKPKILANGARWKYMTLLNLLLQTI FYGVTCLDD VLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFWILFLYNRDL 
I YPKVLDTVI PVWLNHAMHTF1 FPITLAEWLRPHSYPSKKTGL 
i lltaaas 1ay x sr i lwlyfetgtwvyp v faklsllglaaffsls 
YVFIASIYLLGEKLNHWKWVSVQILQRWRIjESVGICFQWPDWKS 
PAKHQLVKNIR 


5521 


546 


911 


kilnmqksceenegkpqnmpkaeedrpledvpqeaegnpqpsee 
gvsqeaegnprggpnqpgqgfkedtpvrhldpeemirgvdeler 
lreeirrvrnkfvmmhwkqrhsrsrpypvcfrp 


5522 


1224 


637 


gsrplgqrsrekmwvfgygsliwkvdfpyqdklvgyitnysrrf 
wqgstdhrgvpgkpgrwtlvedpagcvwgvayrlpvgkeeevk 
ayldfrekggyrtttvifypkdpttkpfsvllyigtcdnpdylg 
paplediaeqi fnaagpsgrnteylfelans irnlvpeeadehl 

FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQILRAIGKGSFG 
KVC I VQKRDTEKMYAMKYMNKQQC I ERDE VRNVFRELE I LQE IE 
HVFLVNLWYSFQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDTV 
RLYICEMALALDYLRGQHIIHRDVKPDNILLDERGHAHLTDFNI 
ATI IKDGERATALSGTKPYMAPEI FHSFVNGGTGYSFEVDWWSV 
GVMAYELLRGWRP YDIHS SNAVESLVQIiFSTVS VQYVPTWS KEM 
VALLRKLLTVNP EHRL S S LQDVQAAPALAG VL WDHLSE KR VE PG 
FV PNKGRLHCD P TFE LEEM I LESRP LH KKKKRLAKNKSRDNS RD 
S SQS ENDYLQDCLDAI QQDFVI FNREKLKRS QDLPREPLPAPES 
RDAAEPVEDEAERS ALPMCG P I CP S AGSG 


5524 


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMELRCGGLLFSSRFDSG 
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CAETE FENGNRS W FYFS VRGGMPGKL I KINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWER I RDRPTFEMTETQFVLS FVHRFVEGRGA 
TTFFAFCYPFSYSDCQEUjNQLDQRFPENHPTHSSPLDTlYYHR 
ELLCYSLDGLRVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
Kr AG KK 1 r FliS S RVHPGETP S S FV FNG FLD F 1 1*R PDDPRAQTLR 
RLFVFKLI PMI^PDGVVRGHYRTDSRGVNLNRQYLKPDAVLHPA 
I YGAKAVLLYHHVHSRLNSQ S SSEHQPSS CLP PDAPVSDLEKAN 
NLQNEAQCX^SADRHNAEAWKQTEPAEQKLNSVWrMPQQSAGLE 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNS FSDESTQVE 
NMLYPKLISI^SAHFDFO/JCNFSEKNMYARDRRDGQSKEGSGRV 

Al IKAool lHi> 1 1 11C.LJN iW l\»Kb VNS1 K/VMJHJJlYGKASPPPPPA 

FPSRYTVELFEQVGRAMAIAALDMAECNPWPRIVLSEHSSLTNL 
RAWMLKHVl^SRGLSSTLNVGVNKKRGLRTPPKSHNGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGL 
PGLG S S TQ KVTHR VLG P VRGKP VWEPLQHVFGCIjGHCWGK 


5525 


105 


834 


SNTJjDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEE F LGR VAELNDVTAKVASGQEKHLL FEVQ PG S DS SAFWKV 
VVRWCTKJNKSSGIVEASRIMNLYQFIQLYKDITSQAAGVLAQ 
S STS EEPDENSSS VTSCQAS LWMGRVKQLTDEEBCCI CMDGRAD 
L I LP CAHS FCQKC I DKWSDRHRNCPI CRLQMTGANES WWS DAP 
TEDDMANYILNMADEAGQPHRP 


5526 


3 


853 


RRPCNPVl^AAKRTGAAARAPRGLEVTMLRVAWRTLSLIRTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDP P PS TLLKDYQNVPG I EKVDDWKRLLS LEMANKKKMLKI 
KQ EQ FM KK I VANP EDTRSLEAR 1 1 ALS VKI RS YEE HLE KHRKD K 
AHKRYLLMSIDQRKKMLKNLRNTNYDVFEKICWGLGIEYTFPPL 
Y YRRAHRRFVTKKALC I RVFQETQKLKKRRRALKAAAAAQKQAK 








RRNPDSPAKAI PKTLKDSQ 


5527 


3225 


565 


LLRKYLLHQNPLLLRHQPNRTCISFSATMKLKDTKSRPKQSSCG 
KFQTKGIKWGKWKEVKIDPNMFADGQMDDLVCFEELTDYQLVS 
PAKNPSSLFS KEAPKRKAQAVS EEEEEEEGKSSSPKKKI KLKKS 
KNVATEGTS TQ KEFE VKD PE LEAQGDDMVCDDPEAGEMTS ENL V 
QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWI PEVHDQKADVS 
AWKDLFVPRPVLRALSFLGFSAPTPIQALTLAPAIRDKLDILGA 
AETG SGKTLAFAI P M I HA VLQ WQ KRNAAP P PSNTEA P PGETRTE 
AG AETRS PG KAE AE S DAL PDDTV I E S EAL PSD I AAE ARAKTGGT 
VS DQ ALLFGDD D AG EG P S S L I RE KP VPKQNENEEENLDKEQTGN 
LKQELDDKSATCKAYPKRPLLGLVLTPTRELAVQVKQHIDAVAR 
FTG I KTAILVGGMS TQKQQRMLNRRPE I WATPGRLWELI KEKH 
YHLRNLRQLRCLWDEADRMVE KGHFAELS QLLEMLNDSQYNPK 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RG KP KVIDLTRNEATVETLTETK I HCETDE KDFYL YYFLMQ Y PG 
RS LVFANS I S CI KRLSGLLKVLD IMP LTLHACMHQKQRLRNLEQ 
FAR L EDCVLLATDVAARG LDI PKVQ HV I H YQ VPRTS E I YVHRSG 
RTARArNBGLSLMLIGPEDVINFKKIYKTLKXDEDIPLFPVQTK 
YMD WKER I RLARQ I E KS E YRNFQACLHNS W I EQAAAALE I ELE 
EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSGKPPLLVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3 


895 


G P FLS ACRMWGACKVKVHDS LAT I S I TLRRYLRLGATMAKS KFE 
YVRDFEADDTCLAHCWVWRLDGRNFHRFAEKHNFAKPNDSRAL 
QLMTKCAQTVMEELED I VI AYGQSDE YS FVFKRKTNWFKRRAS K 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRWVYPSNQT 
LKDYLSWRQADCHINNLYNTVFWALIQQSGLTPVQAQGRLQGTL 
AADKNE ILFS EFNINYNNE PPM YRKGTVL I WQKVDEVMTKE I KL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWIj 


5529 


48 


640 


TFRLVSAHLKTRKIjINPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMENPNLREI VEQCVLEPD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDLKPENWFFEKQGLVKLTDFGFSNK 
FQPGKKLTTSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGQP P FQEANDS ETLTM I MDCK YTVPS HVS KECKDL I TRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
S 1 1 QRMVLGD I ADRDA I VEALETNR YNH ITAT YFLLAER I LREK 
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQS PARAADS VLNGHRS KGLCDSAKKDDLPELAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWIiRRKPS 
VTNRLTSRKSAPVLNQIFEEGESDDEFDMDENLPPKLSRIjKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYS WHRRDSSEG P PGSEGDGGGQS KPSNASGGVDKAS PS ENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMS LCLGS QLHG S TKY I IDPQNGLS FSSVKVQEKSTWKMCI SST 
GNAGQVPAVGG I KFFSDHMADTTTELERI KSKNLKNNVLQLPLC 
EKTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGTVLFARLF 
ALEPDLLPLFQ YNCRQFSS PEDCLS S P E FLDHIRKVMLVI DAAV 
TNVEDLSSLEEYLASLGIIKHRAVGVKLSSFSTVGESLLYMIjEKC 
LGPAFTPATRAAWS QL YGAWQAMSRGWDGE 


5532 


3395 


1402 


S DWMWGKRKM 1 1 EDETE FCGEELLHS VLQCK5 VFDVLDGE EMR 
RARTRANPYEMIRGVFFliNRAAMKMANMDFVFDRMFTNPRDSYG 
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFEPYYGEGGIDGDGDITRPENISAFRN 
FVLI)NTDRKGVHFIJ4ADGGFSVEGQENLQEILSKQLLLCQFIjMA 
L S IVRTGGHFICKTFDLFT P FS VGLVYLL YCCFR R VCL FKP I TS 
R PANS ER YWC KG LKVG I D D VRD YL FAVN I KLNQ LRNTDS DVNL 








WPLE Vl KGDHE FTD YM I RSNESHCS IiQIKALAKIHAFVQDTTL 
SBPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKFFELIQGTEI 
D I FS Y KPTLLTS KTLE KI RP VFD YRCMVSGS EQKFL IG LGKSQ I 
YTWDGRQSDRWIKLDLKTELPRDTLLSVEIVHELKGEGKAQRKI 
S A I H I LD VL VLNGTDVRE QHFNQR I QUAE KFVKAVS KPS RPDMN 
P I RVKE VYRLEEMEKI FVRLBMK I I KGSSGTPKLS YTGRDDRHF 
VPMGLYIVRTVNEPWTMGFSKSFKKKFFYNKKTKDSTFDLPADS 
IAPFHICYYGRLFWEWGDGIRVHDSQKPQDQDKLSKEDVLSFIQ 
MHRA 


5533 


94 


789 


MKE RRAPQP WARC KLVLVGDVQCGKTAMLQVLAKDCY PETYVP 
TVFENYTACLETEEQRVELSLWDTSGSPYYDNVRPLCYSDSDAV 
LLCFDISRPETVDSALKKWRTEILDYCPSTRVLLIGCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HS 1 FRTASMLCLNKPS PLPQKS P VRS LS KRLLHLPSRSELI S PT 
FKKE KAKXCS I M 


5534 


3 


605 


LVRGRARAANPGRVGAMIX3LRQRVEHFLEQRNLVTEVLGALEAK 
TGVE KRYLAAG AVTLLS L YLLFG YGAS L LCN L I GFVYP AYAS I K 
AI ES PS KDDDTVWLTYWWY ALFGLAEFFSDLLLS W FP FY YVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVXPSQTPQP KDK 


5535 


1029 


332 


KSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNIKMKEDLIKELIKTGNDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELEKKDLSDVAM 
KVKLQKEFRXKVDAAXLRVQVI^KKQQDSKKIASLSIQNEKRAN 
EUEQSVTJHMKYQKIQI^RKLQEENEKRKQLDAVIKRDQQKIKVI 
LSYI PAKYNMKC 


5536 


942 


282 


AAATAASLSPRGCRLRTPSSDVSPSRAPPPSAAPLPTGRAQMSP 
SGRLCLLTI VGLI LPTRGQTLKDTTS S S S ADAT IMD I QVPTRAP 
D AVYTE LQ PTS P T P TWP ADETPQP QTQTQQLEGTDG PL VTDP ET 
HKST KAAHPTDDTTTLSER PS PSTD VQTD PQTLKP S G FHEDD P F 
F YDEHTLRKRGLLVAAVLF ITG 1 1 1 LT.SGKCRQLSRLCRNHCR 


5537 


3 


2391 


RARVS S PQLRVFRSGRPRRLRVLR INRTS VALRLAGTGRF VAKT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTE AS FQKVI SRRHGS CDLENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESS VESLFHQQ I LSSCAKS YNFDQYRKV 
FTHS SLLNQQEE I D I WGKHH I YDKTS VLFRQVSTLNS YRNVFIG 
EKNYHCNNSEKTLNQSSSPKNHQENYFLEKQYKCKEFEEVFLQS 
MHGQE KQEQ S YKCNK CVE VCTQS LKH I QHQT IHI RENS YS YNK Y 
DKDLSQS SNLRKQ 1 1 HNEE KP YKCEKCGDS LNHSLHLTQHQI I P 
TEEKP Y KWKECG KVFNLNCS L YLTKQQQ I DTGENL Y KCKACS KS 
FTRSS^IVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS 
SNL I MHQR VHTGE KP Y KC KECGKVF S RS S CLTQHRK I HTGENL Y 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRI HTG E KP YKC KACS KS FSDS SG LTVHRRTHTGE K P YTCKE 
CGKAF S Y S S DVI QHRR I HTGQR P YKC EE CG KAFN YRS YLTTHQR 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FS YRS YLTTHRRSHS GERP YKCEECGKAFNSRS YL I AHQRSHTR 
EKL 


5538 


926 


161 


HSKMMKIPWGSIPVLMLLLLLGLIDISQAQLSCTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVI TNMNNNYEPRSGKFTCKVPGLYYFTYHA 
S SRGNLCVNLMRGRERAQKWTFCD YAYNT FQVTTGGMVLKLEQ 
GENVFLQATDKNSLLGMEGANS I FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG 
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREK 
DE I YGHPLF PLLAL VFE KCELATCS PRDGAG AGLGTP PGGDVCS 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 








ELEKVHDLCDNFCHRYI TCLKGKMP I DLVI BDRDGGCREDFEDY 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTSVASPSSGGEDEDLDQERRRNKKRGIFPKVATNIM 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5542 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 


2405 


665 


RWVRE QPW PLRTS EAVKT P ALR P FPG P RG VS P F PKPD WG KS PAP 
KR P FSDSGAFWS PERRPG VLEAPRRRPVPAS FRAVPPKPTR VHG 
S S AS RDR VLAR TMI VADS ECRAE L KD YLRFAPGG VGD S G PGEEQ 
R ES RARRG PRG P S AF I P VEE VLREGAE S LEQHLG LEALM S SGRV 
DNLAWMGLH PD YFTS FWRLHYLLLHTDGPLAS S WRHY I AIMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHR PWLITKEHIQALLKTGEHTWSLAELI QALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELEKSESLL 
VTPSADILEPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTS VLRRAIWNY IHCVFGI RYDDYDYGEVNQLLERNLKVY I KTV 
ACYPEKTTRRMYNLFWRHFRHSEKVHVNLLLLEARMQAALLYAL 
RAITRYMT 


5544 


1895 


514 


LGGLLGR QRLLLRMGAGR LGAPMERHG RAS ATS VS S AGEQAAGD 
PEGRRQE PLRRRAS S AS VPAVGAS AEGTRRDRLGS YSGPTS VSR 
QR VESLRKKRPL FPWFGLD IGGTLVKLVYFEPKD I TAEEEEEEV 
E S LKS I RKYLTS NVAYG S TG I RD VHLE L KDLTL CG RKGNLH FIR 
F PTHDM P AF IQMGRD KNF S S LHTVFCATGGGAYK F E QDFLT I GD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTT FEEALEMASRGDSTKVDKLVRDI YGGDYERFG 
LPGWAVASSFGNMMSKEKREAVS KEDLARATLITITNNIGS IAR 
MCALNE N I NQ WF VGNFLR INT I AMRLLA YALD YWS KGQLKALF 
SEHEGYFGAVGALLELLKIP 


5545 


802 


131 


G AMWS AGRGGAAW P VLLG LLLALL VPGGGAAKTGAELVTCGS VL 
KLLNTHHRVRLHSHDIKYGSGSGQQSVTGVEASDDANSYWRIRG 
GSEGGCPRGSPVRCGQAVRLTHVLTGKNLHTHHFPSPLSNNQEV 
S AFGE DGEGDD LDLWTVRCS GQHWEREAAVRFQHVGTS VFLSVT 
GEQYGSP IRGQHEVHGM PS ANTHNTW KAMEG I F I KPS VE PSAGH 
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKWPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQ TG P PETIAFT FPRSTMEPL CP LLLVG FSLP LARALRGNE TTA 
DSNETTTTSGP PDPG AS Q P LLAWLLL PLLLLLLVLLLAAY F F RF 
RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEEN 
RE KNR YPNI LPNDHSR VI LS QLDG I P CSDYINAS Y1DC YKEKNK 
FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 
WPE>O^CWTYGNIRVCVEDCVVLVDYTIRKFCIQPQLPDGCKAPR 
LVSQLHFTSWPDFGVPFTP IGMLKFLKKVKTLNPVHAGP I WHC 
SAGVGRTGTFIVIDAMMAMMHAEQKVDVFEFVSRIRNQRPQMVQ 
TDMQYTFIYQALLEYYLYGDTELDVSSLEKHLQTMHGTTTHFDK 
IGLEEEFRKLTl^IMKENMRTGNLPANMKKARVIQI IP YDFNR 
VILSMKRGQEYTDYINAS F I DG YRQ KD Y F I ATQG PLAHT V EDF W 
RM I WEW KSHT I VM LTEVQE REQDKC YQ YWPT EG S VTHGE I T I E I 
KNDTLSEAISIRDFLVTLNQPQARQEEQVRWRQFHFHGWPEIG 
I PAEGKGMIDLIAAVQKQQQQTGNHP ITVHCSAGAGRTGTF IAL 
SNI LERVKAEGLLDVFQAVKS LRLQRPHMVQTLEQ YEFCYKWQ 
DFIDIFSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGS AGI KGLGRVFR I MDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGT I DFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGV I T I E DLRE VYNAKHHP KYQNGE WS EEQ VFRKFLD 

vtct^ C DV"n VTV1T \7T , DI?T7t?MXrVvar2\7C2iQTriTri\7VPT TMMPTAWVT. 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLS LVKELDAFPKVPES YVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAMKCQ Y VGADVLDLAETMVAS ADGL VYEPTVFDLS PQQKEWQ 
RMLQL I QSRLQEEHSLQDVI FKS AFKS TS TALP PREDDSSQS PN 
ACR IHGHL YVNKVAGKFH I TVG KAIPH PRGHAHLAALVNHES YN 
FSHRIDHLSFGELVPAIINPLDGTEK1AIDHNQMFQYFITWPT 
KLHTY K I S ADTHQ F S VTERE R 1 1 NHAAG S HGVSG I FMKYDLS S L 
M VTVT E EHMPFWQ FFVRLCG I VGG I FS TTGMLHG I GKFI VE 1 1 C 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLEKNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 








W FVFRR YAE FD KL YNTI _. K KO F P &MA T .V T D a V"D T mauri ran c t v 

QRRAGIiNEFIQNLVRyPELYNHPDVRAFLQMDSPKHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LIAKRKLDGKFYAVKVLQKKIVLNRKEQKHIMAERNVLLKIfVKH 
PFLVGLHYSFQTTEKLyFVLDFVNGGELFFHLQRERSFPEHRAR 
FYAAEIASAU3YLHSIKIVYRDLKPEWILLDSVGHWLTDFGLC 
KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLYGLPPFYCRDVASMYDNILHKPLSLRPGVSLTAWSILEELL 
EKDRQNRLGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 
AGPDDI RNFDTAFTEETVPYSVCVSSDYS I VNAS VLEADDAFVG 
FSYAPPSEDLFL 


5552 


2748 


930 


LGPAAGAAMGKKHKKHKAEWRSSYEDYADKPLEKPLJCLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
p v kauk 1 u PAciN h S TP I QQLLEH FLRQLQRKDPHG FFAFPVTDA 
I APGYSMI I KHPMDFGTMKDKI VANEYKS VTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKI LHAGFKMMSKQAALLGNEDTAVEEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSWNTAEP 
DADEEETHPVDLSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDETGVQCALSLQEFVKDA 
GS YS KKWDDLLDQI TGGDHSRTLFQLKQRRNVPMKPPDEAKVG 
f a iwuooao viicrwan^s I PD Vfc> VJJlfainijboLwKVKKELDPDDS 
HLNLD ETT KL LQDLHEAQAERGGS R PSSNLS S LSNASERDQHHL 
GS PS RLS VGEQ PD VTHDP YEFLQS P EPAAS AKT 


5553 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPLLESWALSQVAGMP 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLG I PATI VLPESTSLQ WQRLQGEGAE VQLTGKVWD 
EANLRAQELAKRDGWENVP P FDH PLI WKGHASLVQELKAVLRT P 
PGALVLAVGGGGLLAG WAGLLEVGWQHVP 1 1 AMETHGAHCFNA 
A I TAG KL VTL PD I TS VAKS LG AKTVAARAL E CM Q VC KI HS E WE 
DTEAVSAVQQLLDDERMLVEPACGAALAAIYSGLLRRLQAEGCL 
PPS LTSVWI VCGGNNI NS RELQALKTHLGQV 


5554 


166 


2318 


CSGRTGGRGSLRPAENVCLTCKIjSGAETRGLLCPALRTWIMKVL 
GRS FFWVLFPVLP WAVQAVEHEEVAQRVI KLHRGRGVAAMQS RQ 
WVRDS CRKLSGLIjRQKNA VLNKL KTA I GAVEKDVGLSDE E KL FQ 
VHTFEIFQKELNESENSVFQAVYGLQRALQGDYKDWNMKESSR 

QRLEALREAAI keeteymellaaekhqvealknmqhqnqslsml 

DE ILEDVRKAADRLEEB IEEHAFDDNKSVKGVNFEAVLRVEEEE 

HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYIICGVLLGPSGLN 
S IKS I VQ VETLGE FG VFFTLFL VG LE FS P EKLRKVW KI S LQG P C 
YMTLLM I AFGLLWGHLLR I KPTQS V F I STCLS LS S TPLVS R FLM 
GSARGDKEGD I DYS TVLLGMLVTQDVQLGLFMAVMP TL IQAG AS 
ASSS I WEVLRI LVLIGQILFSLAAVFLLCLVI KKYLIGPYYRK 
LHMES KGNKE I L ILG ISAFI FLMLTVTELLDVSMELGCFLAGAL 
VSSQG PWTEE I ATS IEP I RDFLAI VFFAS IGLHVFPTFVAYEL 
TVLVFLTLSVVVMKFLLAALVLSLILPRSSQYIKWIVSAGLAQV 
S EFS FVLG S RARRAG VI S RE VYLL ILS VTTLS LLLAP VLWRAAI 
TRCVPRPERRSSL 


5555 


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
GTiMAPQNLSTFCLLLLYLIGAVIAGRDFYKILGVPRSASIKDlK 
KAYRKIAliQLHPDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPROQDRNIPR 
GSD 1 1 VDLEVTLEE VYAGNFVEWRNKPVARQAPGKRKCNCRQE 
MRTTX2I/5PGRFQMTQEVGCDECPNT/KLVNEERTLEVEIEPGVRD 
GME YP F I GEGEPKVDGEPGDLRFRI KWKHP I FERRGDDLYTNV 
TISLVESLVGFEMDITTOJXJHKVHrSRDKITRPGAKLWKKGEGL 
PNFDNNNIKGSLIITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYNGLQGY 


5556 


5835 


3346 


RTRGMSKNCVPMEFEEYLLRMFQGTFTLU3KITKDNNAHTVKSR 








LEEIiDESYIEKFTDFLRLPVSVHLRRIESYSQFPWEFLTLLFK 
YTFHQPTHEG YFSCLD I WTLFLD YLTSKI KS RLGDKEAVLNRYE 
DALVLLLTEVLNRI Q FRYNQAQLE ELDDETLDDDQQTEWQRYLR 
QS LEWAKVMELLPTHAFS TLFP VLQDNLE VYLG LQQ FI VTSG S 
GHRLN I TAENDCRRLHCS LRDLS S LLQAVGR LAEYF I GD VFAAR 
FNDALTWERLVKVTL YGSQ I KLYNI ETAVPS VLKPDL IDVHAQ 
SLAALQAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST 

VA/nnirT T T C 7\ nJT T trOT R r PT , \rD PV TTT T C TP AVTnifVWMP TTHRCl 

LRLVDKAQVLVCRALSNILLLPWPNLPENEQQWPVRSINHASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTKL I IHQTLSVLEDI VENI 
SGESTKSRQICYQSLQESVOVSLALFPAFIHQSDVTDEMLSFFL 
TLFRGLRVQMGVPFTEQI IQTFLNMFTREQLAES ILHEGSTGCR 
WEKFLKI LQWVQEPGQVFKP FLPSI IALCMEQVY P I IAERPS 
PDVKAELFELLFRTLHHNWRYFFKSTVLASVQRGIAEEQMENEP 
Q FS A I MQAFGQ S FLQ P D I HLF KQN LF YLETLNTKQKL YHKKI FR 
TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
AFLPEFLTS CDGVDANQKS VLGRNFKMDRVRRERGRAKRRAEWA 


5557 


1712 


491 


VI LGAGLRDKDMW I PWGLPRRLRLS ALAGAGRFCILGS EAATR 
KHLPARNHCGLSDSSPQLWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQ I YLGKPSRPPHLLLECNPGPG I LTQALLEAGAKW 
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMSSRGLFKNLGIEAVPWTADIPIiKWGMFPSRGEKRALWKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACE IKVLHMEPWS S FDI YTRKGPLENP KRRELLDQLQQKLY 
LIQMIPRQNLFTKNLTPMNYNIFFHLLKHCFGRRSATVIDHLRS 
LTPLDARDILMQIGKQEDEKVVNMHPQDFKTLFETTERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


rC/iLrL 1 rih\jv irAUijijMr'AC. rK-KrlJiV. 1 V- Vl_J_iijy rvjyf^rctjjr 1 1 rl J. 

TGVFS MRLWT P VGVLTS LAYCLHQRRVALAELQEADGQC PVDRS 
LLKL KMVQ W F RHGARS P LKP L P LE EQ VE WN PQ LLEVP PQTQ FD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 

lgerlrknyvedipfls ptfnpqevfirstni frnles trclla 
glfqcqkegpi i ihtdeadsevlypnyqscwslrqrtrgrrqta 
slqpg i sedlkkvkdrmgi dssdkvdffilldnvaaeqahnlps 
cpmlkrfarnieqravdtslyilpkedreslqmavgpflhil.es 
nllkamds atapdkirklylyaahdvtfi pllmtlg i fdhkwp p 
favdltmelyqhleskewfvqlyyhgkeqvprgcpdglcpldmf 
lnams vytls pekyhalcsqtqvme vgnee 


5559 


150 


1983 


plaatahfakmsrvakyrrqvsedpdidslletlspeemeelek 
eldwdpdgsvpvglrqrnqtbkqstgvynreamlnfceketkk 
lmqremsmdeskqvetktdakngeergrdaskkalgprrdsdlig 

KKPKRfiRTiKK^F^RDRDEAfiGKSGEKPKEEKIIRGIDKGRVRAA 

vdkkeagkdgrgeeravatkkeeekkgsdrntglsrdkdkkree 
mke vakkedde kvkgerrntdtrkeg e kmkraggntdmkkedek 
vkrgtgntdtkkddekvkkneplhekeakdds ktkt pekqtpsg 
ptkpsegpakveeeiaapsifdeplervknndpemtevnvnnsdc 
ITNE I lvrftealefntwklfalantraddhvafaiaimlkan 

KT I TSLNLDSNH I TGKG I LAI FRALLQNNTLTE LRFHNQRH I CG 
GKT£MEIAKLLKE>TI^IJ j KXjGYHFEIjAGPRMTV^ 

rqkrlqeqrqaqeakgekkdllevpkagavakgspkpspqpspk 
pspkns pkkggapaapp ppp pplap plimenlkns lspatqrkm 
gdkvlpaqeknspjdqllaairssklkqlkkvevpkllq 


5560 


9 


921 


sswefsalsvsmaclspsqlqkfqqdgflvlegflsaeecvam 
qqrigeivaemdvplhcrtefstqeeeqlraqgstdyflssgdk 
irfffekgvfdekgwlvppeksinkighalhahdpvfksiths 
fkvqtlarslglqmpvwqs MY I fkqph fggevs phqdas fl yt 

EPLGRVLGVW I AVEDATLENGCLWF I PGSHTSGVS RRMVRAPVG 
SAPGTS FLGS E PARDNS LFVPTPVQRGALVL IHGEWHKS KQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561 


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 








QLLAPTYFSAPGVMNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVTIKPPPPEWSRGS 
S 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETLEDLKLH 
LQSTDYGNFLANEASPLTVSVIDDRLKEKWVTVEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFHADRRAFIIT 
IKSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKNDHAAAGAAGLVRGLKAGVLSQADYLULVQCETLSDhKLH 
LQSTDYGNPLANEASPLTVSVIDDRLKEKMWEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT 
INSFGTEI^KEDRAJGjFPHCX5R1jYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALLL 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PDC PDS S DELG CGTNE I L P EGD ATTMG P P VT LE S VTS LRU ATTM 
GPPVTLESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNS PNPARAGS I S RPQRAPGS VSAVAMTAAV F FGCAFIAFG PA 
LALYVFT I ATE PLRI I FLI AGAFFWLVSLLI SSLVWFMAR VI ID 
NKDGPTQKYLLIFGAFVSVYIQEMFRFAYYKLLKKASEGLKSIN 
PGE TAPS MRLLAYVS GLG FG I MSGVFS FVNTLS DS LG PGT VG IH 
GDS PQ FFLYSAFMTLVI I LLHVFWG I VFFDGCE KKKWG ILL I VL 
LTHLLVSAQTF ISS YYGINLAS AFI I L VLMGTWAFLAAGGS C RS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT 
KNVKE YVR WMMYWI VFALYTVI ETVADQTVAWFPLYYELKIAFV 
I WLLS P YTKGASLI YRKFLHPLLSS KEREI DDY I VQAKERGYET 
MVNFGRQGLNIJ^ATAAVTAAVK^OJGAITERLRSFSMHDLTTIQG 
DE PVGQR PYQPLPEAKKKS KPAPSES AG YG I PLKDGDEKTDEEA 
EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 
RPQVYF 


5567 


1554 


233 


EFLGSGVSPDLANEDGLTALHQCCIDDFREMVQQLLEAGANINA 
CDSECWTPLHAAATCGHLHLVELLIASGANLLAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDS IEAARAVPELRMLDDIRSRLQ 
AGADLHAP LDHGATLLHVAAANG F S EAAALLLEHRAS LS AKDQD 
GWE PLHAAAYWGQVPLVELLVAHGADLNAKS LMDET PLDVCGDE 
EVRAKLLELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
PPPPEEDNPEWRPHNGRVGGSPVRHLYSKRLDRSVSYQLSPLD 
S TTPHTLVHDKAHHTLADLKRQRAAAKLQRPP P EGPES PETAEP 
GLPGDTVTPQPDCG FRAGGD P PLLKLTAPAVEAP VERR P CCLLM 


5568 


1731 


587 


AEDRQPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
S LLVSGP RLFLLQQ P LAPSGLT LKS E ALRNWQ VYRL VTY I FVYE 
NP I SLLCGA 1 1 1 WRFAGNFERTVGTVRHCF FTVI FAI FS AI I FL 
S FEAVSSLS KIiGEVEDARGFTPVAFAMLGVTTVRSRMRRALVFG 
M WPSVLVPWLLLGASWLIPQTS FLSNVCGLS IGLAYGLTYCYS 
IDLSERVALKLDQTFPFSLMRR I S VFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNS PGTVYSGALGTPGAAGSKESSR VPMP 


5569 


2 


835 


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG 








LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
iUjci r vj j. rAl rU x Kvj P Kuy Kvib PvjJj PQjn PLjKNG P Mo P PGMPG VPG 
PMG I PGEPGE EGR Y KQKFQS VFTVT RQTHQP PAPNS LI R FNAVL 
TNPG<3DYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTS KTNQ VNSGG VLL RLQ VGE E VWI^VND YYDMVG I QG 
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL 
DVINQTWTALYDLTNIFESFLPQLLAYPNPIDPLNGDAAAMYLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTVCL 
DVINQTWTALYDLTNIFESFLPQLLAYPNPIDPLNGDAAAMYLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLLGAYRLWV 
RWGRRGLGAGAGAGEESPATSLPRMKKRDFSLEQLRQYDGSRtfP 
R I LLAVNGKV FDVT KGS KF YG PAG P YG I FAGRDAS RGLAT F CLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKQD 


5573 


2562 


219 


VPARTPNAEDQGPEARAATATPCQSGGRERAGEAAEDGVKMAAF 
S EMGVMPE IAQAVEEMDWLLPTDIQAE S I PL I LGGGDVLMAAET 
GSGKTGAFS I P VI Q I VY E TLKDQQEGKKGKTT I KTGAS VLN KWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGLMKGKHYYE 
VS CHDQGLCR VG WS TMQAS LD LGTDKFG FGFGG TGKKSHNKQFD 
NYGEEFTMHDT IGCYLD I DKGHVKFS KNGKDLGLAFE IP PHMKN 
Q AL FP ACVLKNAE LKFN FGEEEFKF P P KDG FVALS KAPDG Y I VK 
S QHSGNAQVTQTKFLPNAPKALI VE PSRELAEQTLNNI KQFKKY 
IDNPKLRELLIIGGVAARDQLSVLENGVDIWGTPGRLDDLVST 
GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHNQIPQVTSDGKR 
LQVIVCSATLHS FDVKKLS EKIMHFPTWVDLKGEDSVPDTVHHV 
WPVNPKTDRLWERLGKSHI RTDDVHAKDNTR PGANS PEMWS EA 
IKILKGEYAVRAIKEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
P DKKGHQFS CVCLHGDRKPHERKQNLERFKKGD VRFL I CTD VAA 
RG I D I HGVP YV INVTLPD E KQNYVHR I GR VGRAERMGLAI S LVA 
TEKEKVWYHVCSSRGKGCYNTRLKEDGGCTIWYNEMQLLSEIEE 
HLiN C. r 1 by Vh PD I K.V P V DE FDGKVT YGQKRAAGGG S YKGHVD I L 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLE VFKEQELQ PEDKGAVPEDASTERSAMAS LGLQLVGY ILG 
LLGLLGTLVAMLLPS WKTS SYVGAS I VTAVGFS KGLWMECATHS 
TGI TQ CDIYSTLLGLPRD I QAAQAMMVTSSA I SS LACI IS WGM 
RCTV F CQE S RAKDRVAVAGG VFFI LGGLLGF I P VAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGIISSLFSLIAGIILCFSCSCQRN 
RSNYYDAYQAQPLATRSSPRPGQPPKVKSEFNSYSLTGYV 


5575 


456 


766 


LLWAL P C P PPT AAAVLLS S TGLMELLE KMLALTLAKADS PRTAL 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RSWGAPWFWRMRIiRRRHMPLRIAMVGCAFVLFLFLLHRDVSSR 
EEATEKPWLKSLVSRKDH\^DLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
EI ILVDDASTEEHLKEKLEQYVKQLQWRWRQEERKGLITARL 
LGAS VAQAE VLT FLDAHCE C FHGWLE P LLAR I AE DKTWVS PD I 
VT I DLNTFE FAKP VQRGRVHS RGNFD WS LTFGWBTLP PHE KQRR 
KDETYP I KS PTFAGGLFS ISKS YFEHI GTYDNQME I WGGENVEM 








S FRVWQCGGQLEI I P CS WGHVFRTKS PHTFPKGTSVIARNQVR 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNFSWYLHNVYPEMFVPDLTPTFYGAI KNLGTNOCLDVGENNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LG S CHFTG KNS Q VP KD EEWBLAQDQL I RNS GSGTCLTS QD KKP A 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGEISVHCLPWVLFILDLKVESSMFCPLKLILLPVLLD 
YSLGLNDLNVSPPELTVHVGDSALMGCVFQSTEDKCIFKIDWTL 
S PGEHAKDE YVLYYYSNLSVP IGRFQNRVHLMGD I LCKDGS LLL 
QVVQEADQGT YI CE IRLKGESQ VFKKA WLHVLPEE P KELMVHV 
GGLI QMG CVFQSTE VKHVTKVEW I FSGRRAKEE 1 VFR YYH KLR M 
S VE YSQS WGHFQNRVNLVGD I FRNDGS I MbQG VRES DG GNYTCS 
IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVLGGNQLVI IV 
GI VCATI LLLPVLI LI VKKTCGNKSSVNSTVLVKNTKKTNPE I K 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMHPV 
WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD
W FGDFS S FRAIiLE PE LR P EDRI LVLGCGNS ALS YELFLGGFFNV 
TSVDYSSVWAAMQARYAHVPQLRWETMDVRKLDFPSASFITVVL 
EKGTLDALIiAGERDPWTVSSEGVHTVDQVLSEVSRVLVPGGRFI 
SMTSAAPH FRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LS VAQLALGAQ I LS P PR P P TS PC FLQDS DHED FLS AI QL 


5579 


3 


1540 


RNSGLARGAS ALARHGGGLAGGVG WDCGACAS RCQGVMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLiLSRVFQPQNL 
REDRVIjSIjQDKSDDLTCKSQRLMLQVGIjIYPASPGCYHLLPYTV 

rameklvrvi dqemqai ggqkvnmpsls paelwqatnrwdlmg k 
ellrlrdrhgkeyclgptheeaitaliasqkklsykqlpfllyq 
vtrkfrdeprprfgllrgrefymkdmytfdsspeaaqqtyslvc 
daycslfnklglpfvkvqadvgtiggtvshefqlpvdigedrla 
icprcsfsanmetldlsqmncpacqgpltktkgievghtfylgt 
kyssifwaqftnvogkptlaemgcyglgvtrilaaaievlsted 
cvrwpsllap yqacli ppxkgskbqaasel igql ydhi teavpq 

LHGEVLLDDRTHLTIGNRLKDANKFGYPFVI IAGKRALEDPAHF 
E VWCQNTG EVAFIiTKDGVMDLLT P VQTV 


5580 


1681 


450 


ADAGTRCIPGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFIJ^SAKVHSVAWSCDGRRLASG 
S FDKTAS VFLLE KDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTIRIWDVRTTKCIATVNTKGENINICW3PDGQTIAVGNK 
DDWTFI DAKTHRSKAEEQFKFEVNE I S WNNDNNMFFLTNGNGC 
INILSYPELKPVQSINAHPSNCICIK.F DPMGKlFATGSADALiVS 
LVf DVDELVC VR C FS RLDW P VRTLS FSHDGKMLASASEDH F I D IA 
EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGLPNDS 


5581 


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNS PS YAPEFQFLHSAYATIiLMKQAWPQNSSS CGTEG 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPNPYQTAMYPIRSAYPG^NLYAQGAYYTQPVYAAQPHVIHH 

T W V\ T\ 7H D KTC T D C A T V D H D VH T3 V Th3(Z\T A MfVAUTi. CI TTM ttM Q A f5TT T . 
1 I VVUrNoXroAl I rnr V J\F\jr IS. 1 WVj v/irlvarl Vi-vo 1 inj-UIO-ryjl J-iLj 

TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5775 


2739 


I ITNNNNVI IPLVIAYHLSGSAQARGERSPAERLMERQKRKADI 
EKGLQFIQSTLPLKQEEYEAFLLKLVQNLFAEGNDLFREKDYKQ 
ALVQYMEGLNVADYAASDQVALPRELLCKLHVNRAACYFTMGLY 
EKALEDSEKALGLDSES IRALFRKARALNELGRHKEAYECSS RC 
SIALPHDESVTQLGQEIAQKLGL.RVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 
VQGGLSGSGVPSELPQLI P VFPGGTPLLPP WGGS IPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL 
DS FGSTRGSLDKPDS PMEETNSQDHRPPSGAQKPAPS PEPCMPN 
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDYTYREGLEHKCK 
RD I LLGRLRS S EDQTWKR IRPRPTKTS FVGS YYLCKDMINKQDC 
KYGDNCTFAYHQEEIDVWTEERKGTLNRDLLFDPLGGVKRGSLT 
IAKLLKEHQGI FTFLCEICFDSKPRI ISKGTKDSPSVCSNLAAK 
HSFYNNKCLVHIVRSTSLKYSKIRQFQEHFQFDVCRHEVRYGCL 
REDSCHFAHS FI ELKVWLLQQ YSGMTHED I VQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGQVVEPDKDLK 
YCSAKARHCWT KERR VLL VMS KAKRKWVSVRPLPS IRNFPQQYD 
LCIHAQNGR KCQ YVGNCS FAHS PEERDMWT FM KENK 1 LDMQQT Y 
DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 
CGKNSNS KKQWQQHIQS EKHKE KVFTSDSDASGWAFR FPMGEFR 
L CDRLQ KGKAC P DGD KCRCAHGQE E LNE WLDRRE VLKQKLAKAR 
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKXAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAI KEGGSGS PS FS S PMD I FDM FFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVICEKCEGVGGKKGSVEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVI REKKI I EVHVEKGMKDGQKI LFHGEGDQEPELEPGDVI 
I VLDQKDUS VFQRRGHDL IMKMKI QLS EALCG FKKTI KTLDNR I 
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKHWLSLE KLPQLEALL P PRQKVRITDDMDQVELKE FCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD
QGGEQAIKEGGSGSPSFSSPMD1FDMFFGGGGRMARERRGKNW 
HQhS VTLEDL YNG VTKKLALQKNVI CE KGEG VGGKKGS VE KCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVI REKKI I EVHVEKGMKDGQKI LFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDL IMKMKI QLS EALCGFKKTIKTLDNRI 
LVI TS KAGEV I KHGDLRCVRDEGMP I YKAPLEKG I LI IQFL VI F 
PE KHWLS LEKLPQLEALLP PRQKVR I TDDMDQ VE L KEFC PNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LP AGT P E S S LHEALDQCMTALDLFLTNQFSEALS YLKPRTKESM 
YHS LTYATILEMQAMMTFDPQDILLAGNMMKEAQ'MLCQRHRRKS 
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLC^LLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGA I FLFLAGR I E VI KGN I DAAIRR FEECCEAQQHWKQ FHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPI SLPVPALEMMYI WNGYAV IGKQPKLTDGILEI ITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I SANE KKI KYDHYL I PNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5586
5587 


2619 
1768 


915 
148 


LPAGTPESSLHEALDQCWTALDLFLTNQFSEALSYLKPRTKESM
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GH S FRS VLCVMLLLCYHTFLTFVLGTGNVN I EEAE KLL KPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL
SSAVPDGAVGRPVAVAVGGPPHSCRCRPCCLMAAIGVHLGCTSA 
CVAVYKDGRAG WAND AGDRVTPA WAYS ENEE I VGLAAKQSRI 
RNI SNTVMKVKQ I LGRS SSD PQAQ KY I AESKCLVI EKNG KLRYE 
IDTGEETKFVNPEDVARLIFSKMKETAHSVLGSDANDWITVPF 
DFGEKQKNALGEAARAAGFNVLRLIHEPSAAIiLAYGlGQDS PTG 
KSNILVFKLGGTSIiSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TLAQYLAS EFQRS FKHDVRGNARAMMKLTNS AEVAKHSLSTLGS 
ANCFLDSLYEGQDFDCNVSRARFEIjLCSPLFNKCIEAIRGLLDQ 

ngftadd inkwlcggssri pkiiqqlikdlfpavellns i ppde 
vipigaai eagiligkenllvedslmi ecsardilvkgvdesga 
srftvlfpsgtplparrqhtlqapgsissvclelyesdgknsak 
eetkfaqwlqdldkkenglpjdiiavltmkrdgslhvtctdqet 
gkceaisieias 


5588


3 


589 


tppppeqamvaatvaaawlllwaaacaqqeqdfydfkavnirgk 
lvslekyrgsvslwnvasecgftdqhyralqqlqrdlgphhfn 
vlafpcnq fgqqe pdsnke i es farrtys vs fpmfs ki avtgtg 
ahpafkylaqtsgkeptwnfwkylvapdgkwgawdptvsveev 
rpq i talvrkl i llkredl 


5589 


1884 


553 


lrqawheggigqtdkergaaalpgeegdptrgrslgraswesgs 
prrprspfssflprpiclslearpcsiedrrnwsligrpgapas 

GLNRSSGLWIjGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 

atlgaagqplgges icsarapakys i tftgkwsqtafpkqyplf 
rppaqwssllgaahssdysmwrknqyvsnglrdfaergeawalm 
keieaagealqsvhavfsapavpsgtgqtsaelevqrrhslvsf 
wrivpspdwfvgvdsldlcdgdrwreqaaldlypydagtdsgf 
tfsspnfatipqdtvteitssspshpansfyyprlkalppiarv 
tllrlrqsprafippapvlpsrdneivdsasvpetpldcevslw 
sswglcgghcgrlgtksrtryvrvqpanngspcpeleeeabcvp 

DNCV 


5590 


72 


896 


lcssgalrllpamvawrsaflvclafslatlvqrgsgdfddfnl 
edavketssvkqpwdhttttttnrpgttrapakppgsgldlada 
lddqddgrrkpgiggrerwnhvttttkrpvttrapantlgndfd 
ladalddrndrddgrrkpiaggggfsdkdledivgggeykpdkg 
kgdgrygsnddpgsgmvaepgtiagvasalamaligavssyisy 
qqkkfcfsiqqglnadyvkgenleawceepqvkystlhtqsae 
pppppepari 


5591 


68 


1494


AGSSRRAAAERLLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA 
LRVTRNS KI NAENKAKI NMAG AKRVPTAP AATS KPGLR PRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPAEEDIiCQAFSDVILAVNDVDAEDGADPNLCSBYVKDIYAYL 
RQLEEEQAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVS 1 1 DRFMQNNCVP KKMLQLVGVTAMF IAS KYE EM YPP E I 
GDFAFVTDNTYTKHQIRQIV1EMKILRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
ILDNGEWTPrLQHYLSYTEESLLPVMQHLAKNAAMVNQGLTKHM 
TVKNKYATS KHAKI STLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGESKDWNQKDLLSALVLTTWCLPTPIMAXSAEVKLAIFGRAG 
VGKS ALWRFLTKRFI WE YDPTLES TYRHQAT I DDE WSME I LD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
I KJCP KNVTL I L VGNKADLDHS RQ VSTEEG EKliATELACAF YE CS 
ACTGEGN I TE I FYELCRE VRRRRMVQGKTRRRS STTHVKQAINK 
MLTKISS 


5593 


3 


1113 


HAS GGRAANMAAERGAGQQQSQ EMME VD RR VES EESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNIilKCIENLEELQSLR 
BLDLYDNQIKKIENLEALTELEILDISFNLIiRNIEGVDKLTRLK 
KL FLVNNKI SKI ENLS NLHQLQMLELGSNRI RAI ENI DTLTNLE 
S L FLGKNK I TKLQNLDAL TNLT VL S MQS NRLTKrEGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDIiNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR 
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGER.QRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LG I PTVPGKVTLQKDAQNL IGIS I GGGAQYCPCLYI VQVFDNTP 
AALDG TVAAGDEI TGVNGRS I KGKTKVEVAKMIQEVKGEVT IHY 
NKLQADPKO^MSLDIVLKKVKHRLVENMSSGTADALGLSRAILC 
NDGLVKRLEELERTAELYKGMTEHTKNLLRAFYELSQTHRAFGD 
VFS V I GVRE PQPAASEAFVKFADAHRS I EKFG I RLLKT I KPMLT 
DLNTYLNKAI PDTRLTI KKYLDVKFE YLS YCLKVKEMDDEEYSC 
IALGEPLYRVSTGNYEYRLILRCRQEARARFSQMRKDVLEKMEL 
LDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLRDADVFPIEVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


GAVLAPSSLPAAELAAQGESQSLEDLSNTSRPTSEVYKISFIFP 

ngdkydgdctr'tssgiyerngigihttpngivytgswkddkmng 

FGRLEHFSGAVYEGQFKDNMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


ISCKMAADGQSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VEVIVGFVTLIIFKRELHTISFLGGLALNEGVNWLIKNVIQEPR 
P CGG P HTAVGT KYGMP S SHS Q FMW F F S VY S FLFLYLRMHQTNNA 
RFLDLLWRHVLSLGLLAVAFLVSYSRVYLLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


GIGP IAAS FI FCKVASLYI FLS PPPPS VSGVP YS PANSSWS CAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECGKLLEE 
IKCALCSPHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADEPCFYYARKDGGLCFPDFPRKQVRGPASNYXD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFH PNYKKNGKL YVS YTTNQERWAI GPHDH I LRWEYTVS RK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
I TLDDMEEMDGLSDFTGS VLRLDVDTDMCNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQ 1 1 KGKDYES E PSLLE FKPFSNGPLVGGFVYRGCQS ERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
V VijLXabtj V eVHr FAFb PCLSGQTMLKMIjSFKIJ^LAVAIjGFFEG 
DAKFGERNEGSGARRRRCLNGNP PKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECX3KLLEE 
I KCALCS PHSQSLFHS PERE VLERDL VL PLLCKD YCKE F FYTCR 
GH I PG FLQTT AD E FC F YYARKDGGLC F P D FPRKQVRG P ASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFHPNYKKNGKL YVS YTTNQERW A I G P HDH I LR WE YTVSR K 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLY I ILGDGM 
I TLDDMEEMDGLS DFTGS VLRLDVDTDMCNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLKLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5601 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVWQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5602 


246 


766 


YHTS CT VWRTAKEALENTEVPVGCLM VYNNEWGKGRNEVNQTK 
NATRHAEMVAI DQVLDWCRQSGKS PS E VFEHTVLYVTVEPC IMC 
AAALRLMKIPLVVYGCQNERFGGCGSVLNIASADLPNTGRPFQC 
I PGYRAEEAVEML KTFYKQENPNAPKS KVRKKE CQQ I LNMF 


5603 


1 


565 


FRGRT P I SGGERGCAQYP I PATPARSGENRTMPGAGDGGKAPAR 
WLGTGLLGLFLLP VTLSLE VS VGKATD I YAVNGTE I LLPCTFS S 
CFGFBDLHFRWTYNSSDAFKILIEGTVKNEKSDPKVTLKDDDRI 
TLVGSTKEKRNNIS IVLRDLEFSDTGKYTCHVKNPKENNLQHHA 
T I FLQ WDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 
DQGQVALGGHYMAEGEGYFAMS EDEIACS P YI PLGGD FGGGDFG 
GGDFGGGDFGGGDFGGGGS FGGHOiDYCES PTAHCNVLNWEQVQ 
RLDG I L S ET I P I HGRGNF P TLE LQP S L I VKWRRRLAE KR I G VR 
DVRLNGSAASHVLHQDSGLGYKDLDL I FCADLRGEGEFQTVKDV 
VLDCLIJ)FLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDS LRRQFEFSVDS FQ I KLDS LLLF YECSE 
NPMTETFHPTIIGESVYGDFQEAFDHLCNKIIATRNPEEIRGGG 
LLKYCNLLVRGFRPASDEIKTLQRYMCSRFFIDFSDIGEQQRKL 
ESYLQNHFVGLEDRKYEYLMTLHGWNESTVCLMGHERRQTLNL 
I TMLA I R VLADQNV I PNVANVT C YYQ PAP Y VAD ANF S N YY I AQ V 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRS LRR Y P LPLRSGKE AKI LQH FGDGL CRMLDERLQRHRTS G 
GDHAPDSPSGENSPAPQGRIAEVQDSSMPVPAQPKAGGSGSYWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKEELLQRCAQKS PRVAP 
G S ARP WP ALRS LLHRNL VLRTHQ PAR YS LT PEGLELAQKLAESE 
GLS LLNVG I GPKEP PGEETAVPGAAS AELASEAGVQQQPLELRP 
GEYRVI^CVDIGETRGGGHRPELLREI^RLHVTHTVRKLHVGDF 
VWVAQETNPRDPANPGELVLDHI VERKRLDDLCSS I IDGRFREQ 
KFRLKRCGLERRVYLVEEHGS VHNLS LPESTLLQAVTNTQVI DG 
FFVKRTADIKESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
S GAMTS PNPLCS LLTFS DFNAGAI KNKAQS VREVFARQLMQVRG 
VSGEKAAALVDRYSTPASLLAAYDACATPKEQETLLSTIKCGRL 
QRNLGPALSRTLSQLYCSYGPLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQRQVQMCKRNLEVMDSVRRGAQLA 
I EECQYQFRNRRWNCS TLDS LP VFGKWTQGTREAAFVYAI S SA 
GVAFAVTRACSSGELE KCGCDRTVHG VS PQG FQWSGCSDNI AYG 
VAFSQS FVDVRERS KGAS SS RALMNLHNNEAGRKAI LTHMRVEC 
KCHG VS G S CE VKT CWRAVP P FRQVGHALKEKFDGATEVE PRRVG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR 


5607 


521 


141 


PPVC^PAEAMPSPGWCSLLLLGMLWLJDLAMAGSSFLSPEHQRV 
QQRKESKKPPAKLQPRA1AGWLRPEDGGQAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R I QTEPKYTG I WHC\TRDT YHRBRVWGFYRGLLLPVCTVSLVS SE 
VFGTYRHCLAHI CRLRFGNPDAKPTKAD I TLSGCAS GLVRVFLT 
SPTEVAKVRLQrQTQAQKQQRRLSASGPLAVPPMCPVPPACPEP 
KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 
VLCEWLS PAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVI ksrl 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KR I REAKRSAR PELKDS LDWTRHN YYES FS LS PAAVADNVERAD 
ALQL.S VEE FVERYERP YXP WLLNAQEGWS AQE KWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYI E YMESTRDDS PLYI FDSSYG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGIHIDPLGTSAWNALVQGHKRWCLFPTSTPRELIKVTRDEGG 
NQQDEAITWFNVIYPRTQLPTWPPEFKPLE1LQKPGETVFVPGG 
WWHVVLNLDTT I AITQNFAS STNFPWWHKTVRGRPKLSRKWYR 
ILKQEHPELAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SB CES GSEGDGTVHRRKKRRTCS MVGNGDTTS QDDCVS KE RS SS 
R 


5610 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLSAKWADNFMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQPFNPLLFLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 
VI I FTGLFS VAFLGRRLVLSQWLG I LATIAGLVWGLADLLS KH 
DSQHKLSEVITGDLLIIMAQIIVAIQMVLEEKFVYKHNVHPLRA 
VGTEGLFGFVI LS LLLVPM YYI PAGS FSGNPRGTLEDALDAFCQ 
VGQQPLIAVALLGNISSIAFFNFAGISVTKELSATTRMVLDSLR 
T W I WALSLALGWE AFHALQ I LG FL I LL I GTAL YNGLHRP LLGR 
LSRGRPLAEESEQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLG I PGS TFRGPGACAS SS SLAAS AKPGAGGS PALAMSG 
ELSNRFQGGKAFGLLKARQERRLAEINREFLCDQKYSDEENLPE 
KLTAFKEKYME FDLNNEGE I DLMS LKRMMEKLGVP KTHLE M KKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVLKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATI APHRI PPEMPQYGEENHI FELMQAMWLCKHLNS 
SLLTLENLILNEFSYTATEARRLYLQRKTVPSALLVQLIQERLA 
EEDCIKQGWILDGIPETREQALRIQTLGITPRHVIVLSAPDTVL 
IERNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKLLEYHRNI VRVI PS YPKI LKVIS ADQPCVDVFYQAIiTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RG VD PALRRAE KML P LS I KD DE YKPP KFNL FGK I SGWFRS I LS D 
KTSRNLF F FLCLNLS FAFVELLYGIWSNCLGLISDS FHMFFDST 
AIIAGl^AASVISKWRDNI^FSYGYVRAEVLAGFVNGLFLIFTAF 
FI FSEGVERALAPPDVHHERLLLVSILGFWNLIGI FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQILOGVFLHILADTLGSIGVI 
AS AIMMQNFGLMIADP I CS I L I AILI WS VI PLLRE S VGILMQR 
T P P LLENS LPQC YQRVQQLQG V YS LQE QH F WTLC S D VYVGTLKL 
IVAPDADARWILSQTHNIFTQAGVRQLYVQIDFAAM 


5614 


3 


1268 


LLSRNBHACPLQAGLGLTQRKPKAIRGREGRATNQGQGErQNER 
APWGARQRLG VMAELQQLQE FE I PTGREALRGNHSALLRVAD YC 
E DNYVQATD KRKALE E TMAFTTQALAS VAYQ VGNLAGHTLRM LD 
LQGAALRQVEARVSTLGQMVNMHMEKVARREIGTIiATVQRLPPG 
QKVIAPENLP PLTP Y CRRPLNFG CLDD I GHG I KDLS TQLS RTGT 
LSRKS I KAPATPAS ATLGRP PR I PEP VHLP WPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWPASYLEKVVTLYPYTSQKDNELSFSEGTVICVTRRY 
SDGWCEGVSSEGTGFFPGNYVEPSC 


5615 


9 


1558 


ALGRRRPGDPREMEAAATPAAAGAARREELDMDVKRPLINEQNF 
DGTS D EEHE QELL P VQKHYQLDDQEG I S FVQTLMHLLKGNI GTG 
LLGLPLAIKNAGIVLGPISLVFIGIISVHCMHILVRCSHFLCLR 
FKKSTLGYSDTVSFAMEVSPWSCLQKQAAWGRSWDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LRI YMLCFLPFI I LLVFIRELKNLFVLS FLANVSMAVS LVI I YQ 
YWRNMPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKES KRFPQALN IGMG I VTTL YVTLATLG YMCFHDE I KGS I TLN 
LPQDVWLYQS VKILYSFGI FVTYS IQFYVPAE 1 1 1 PG I TSKFHT 
KW KQ I CE FG I RS FL VS I T CAGAI LIPRLDIVIS FVG AVSSSTLA 
LILPPLVEILTFSKEHYNIWMVLKNISIAFTGWGFLLGTYITV 
EEIIYPTPKWAGTPQSPFLNLNSTCLTSGIjK 


5616 


1 


719 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLS SGDLLRDNMLRGTE IGVLAKAFI DQGKL I PDDVMTRLAL 
HELKNLTQ YS WLLDGFPRTLPQAEALDRAYQ I DTVINLNVPFEV 
I KQRLTARW IHPASGR VYNIEFNPPKTVGIDDLTGE PL I QREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVSSVG
TCEAAGKSPEPKDYDSTCVFCRIAGRQDPGTELLHCENEDLICF 
KD I KP AATHHYLW P KKH I GNCRTLR KDQ VE L VENMVT VGKT I L 
ERNNFTDFTNVRMGFHMPPFCSISHIiHLHVIAPVDQIjGFLSKLV 
YR VNS YW FI TADHL I E KLRT 


5618 


3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
L FLN S GGDS LKS I RLLS E I EKL VGTS VPGLL E 1 1 LSSS I LE I YN 
H I LQTWPDED VTFRKS CATKRKLSN INQEEASGTSLHQKAI MT 
FTCHNEINAFWLSRGSQILSLNSTRFLTKLGHCSSACPSDSVS 
QTN IQNLKGLNS PVLIGKSKDPS C VAKVS E EG KPAIGTQ KME LH 
VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKWEQILGDRIESSACVSKCGNFIWGCYNGLVYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 
VWKSKCGGTVFS S PCLNLI PHHLYFATU3GLLLAVNPATGNVI W 
KHSCGKPLFSSPQCCSQYICIGCVDGNLLCFTHFGEQVWQFSTS 
GP I FS S P CTS P S E QKI F FG SHDCF I YCCNMKGHLQWKFE TTS R V 
YATPFAFHNYNGSNEMLLAAAS TDGKVWI LESQSGOLQSVYELP 
GEVFSS P WLESMLI IGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSLGSPPGAGRGCPCP 
AQS LHSHQIjAAWD p lkp s LRS Y PPHLLQHPQLRS LTASSGHLGR 
RS CPQ PR P LE ELLRAGS S TRPQ PLT S S CCGMS CM YS FLGHCS VL 
LWGTKGRGSGS PSS PGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 

TVGLRPGIiLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST
AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
L I ADAKTL I DKARVETQNHWFTYNETMTVESVTQAVSNLALQFG 
E E DAD PG AMS RP FGVALL FGG VD EKG PQL FHMDPS GTFVQCDAR 
AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLNA 
TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


v vcir vbi lAiLJ/^vj^boijbbV^i^iKMTVRYGKFLSLIjKDGA 
ENDLTWVLKHCERFLKQQQTSI KSSLLCLQGNYAGHDWFVSS LF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
G I HPVYFCSTHYI EMLLKAELPLVFS AFHMSGFAPSQ I CLQW I T 
QC FWN YLD W I E I CH Y I ATCVFLGPDYQVY I C IAVFKHLQQDI LQ 
HTQTQDLQ VFLKEEALHGFRVS D YFE YME I LEQNYRTVLLRDMR 
NIRLQST 


5622 


1122


456 


AASTKDAVSRKRSHSASEKSGTGTS ISKRLNMNPQ 1RNPMKAM Y 
PGTFYFQFKNLWFJVNDRNETWLCFTVEGI KRRSVVSWKTGVFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLTIFTARLYYFQYFCYQEGLRSLSQEGVAV 
EIMDYEDFKYCWENFVYNDNEPFKPWKGLKTNFRLLKRRLRESL 

Q 


5623 


3 


954 


FLPFFIRAPKt SRNGQWL FTFTTP FP FANKAL PG WEG I VP ACFW 
RKK I LTPS TGTME LLQ VT I L FLL P S I CS SNS TGVLEAANNS LW 
TTTKPS ITTPNTESLQKNWTPTTGTTPKGT ITNELLKMS LMST 
ATFLTSKDEGLKATTTDVRKNDS 1 1 SNVTVTSVTLPNAVSTLQS 
S KPKTETQSS I KTTEI PGSVLQPDASPSKTGTLTSIPVTI PENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 
VGL YRMCW KAD PG TPENGNDQPQS DKE S VKL LT VKT I SHE S GEH 
SAQGKTKN 


5624 


159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
S SGS RKLY FDTHAL VCLLEDNG FATQQAE 1 1 VS AL VK I LEANMD 
IVYKDMVTKMQQEITFOQVMSQIANVKKDMIILEKSEFSALRAE 
NEKI KLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKELYSLN 
EKKLLELRTEIVALHAQGDRALTQTDRKIETEVAGLKTMLESHK 
LDNIKYLAGS I FTCLTVALGF YRLW I 


5625 


1 


1180 


TIPS S AAAQRAG P PAGALE ALS PGGARAHAERRGEMRATPLAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTEYTCR 
WPVQEALAVLEPYARLPPHKHVARPTEVIAGTQLLYAFFTRTH 
GDMHS LVRSRHR I PEPEAAVXiFRQMATALAHCHQHGLVLRDLKL 
CR FVF ADRERKKLVLENLEDS CVLTG PDDSLWD KHACP A YVG PE 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRGAYALP AGLSAPARCLVRCLLRRE PAERLTATGI LLH PWLRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


P PRALGS VAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TE NVL H FKAQGHGAKG DNVYE FHLE FLD LVKPE PVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLES EGS P ETLTNLRKGYLFMYNLVQFLGFSW I FVN 
LTVRFCILG KES FYDT FHT VADMMYFCQMLAWE T I NAAIGVTT 
SPVLPSLIQLlLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTC I DMDWKVLTWLRYTLW I PL YPLGCLAEAVS V I Q 
S I PIFNETGRFS FTLPYPVKIKVRFSFFLQ I YLIMI FLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5627 


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI
TENVLHFKAC3GHGAKGDNVYEFHLEFLDLVKPEPVY1CLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF
RHLYKQRRRRYGQKKKKIH 


5628


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
SLSPVARSFS ACS VGLGRSSYRATSCLPALCLPAGG FATS YSGG 
GGWFGEGILTGNEKETMQSLNDRLAGYLEKVRQLEQENASLESR 
I REWCEQQVPYMCPDYQS YFRTI EELQKKTLCS KAENARLWE I 
DNAKLAADDFRTKYBTEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQVESLKEELljCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 

QLQSCQAEIIELRRTVNALEIELQAQHSMRDALESTLAETEARY 
SSQLAQMQCMITNVEAQLAEIRADLERQNQEYQVLIJDVRARLEC * 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCSARP I CVPCPGGRF 


5629 


2287 


938 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAASRRSPAARPPV 
PAP PALPRGR PGTEG S TS LS AP AVL WAVAVVVVVVSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QE VTLQLFTDG I TNKL I GCYVGNTMED WLVRI YGNKTE LLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAI FRL IARQLAKIHAIHAHNGWI PKSNLWLKMGKYFSL I PTGF 
ADEDI^FLSDIPSSQlLQEEMTWMKKILSNU3SPVVLCHNDrr~ 

LCKNIIYNEKQGDVQFIDYEYSGYNYLAYDIGNHFNEFAGVSDV 

DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 

NQFALASHFFWGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMKP 
EVTALKVPE 


5630 


1194 


278 


GFWAIAQTCAHHLPPGSPWLVPASPWRLPEMSSFGYRTLTVALF
TLICCPGSDEKVFEVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
NSNVSVYQPPRQVILTLQPTLVAVGKSFTIECRVPTVEPLDSLT 
L FLFR GNE TLHYE TFG KAAP APQEATATFNS TADR EDGHRNFSC 
LAVLDLMSRGGNI FHKHSAPKMLE I YEPVSDSQMVI IVTWSVL 
LS LFVTS VLLCF I FGQHLRQQRMGTYG VRAAWRRLPQAFRP 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAAI^ALKRKKRYEKQIAQIDGTLSTIEFQREALBNANTN 
TEVLKNMGYAAXAMKAAHDNMDIDKVDELMQDIADQQBIiAEEIS 
TAISKPVGFGBEFDEDELMAELEELEQEELDKNLLEISGPETVP 
LPNVPS IALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWSPPRRLWWGSLGAAQRPAVPVSGLARSLHVETRRPHRRA
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FTFVS S ADAEDLSGS IAS PDVKLNLGGDFIKES TATTFLRQRG Y 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYK1RCVLMPMPSLGF 
NRQ WRDNPD FWG P LA WLFFSM I S L YGQ FRWS WIITIWIFGS 
LT I FLLAR VLGGE VAYGQVLG V I G YS LL PL IVI AP VLL WGS FE 

WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPLLIYPIFLLYIY 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHLHQVPFFCCFTWCLCN 
CLFENS VSKLYMLCFNFFMS I FFYSLS ITKLNLI YLWGLS YQSL 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGR IRS RAAAS R PRAG AG ASGAE PRS GRERS RLS GRRAPAM 
ARNTLS SRFRRVD I DEFDENKFVDEQEEAAAAAAEPGPDPS EVD 
GLLRQGDMLRAFHAALRNSPVNTKNQAVKERAQGVYLKVLTNFK 
S S E I EQAVQSLDRNG VDLLMKY I YKGFEKPTENS SAVLLQWHEK 
ALAVGGLGSIIRVLTARKTV 


5635 


3 


943


DRGPRSTATDTGRARVSFWRFPLDPGVK^SNVQISGEKRRFRTL 
RS LFHPFPVTRSGAPRAVLVGS S WPAKMVAPAVKVARGWSGLAL 
GVRRAVIiQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
E KYVR E LKKTQL I KAAPAGKT S S VFED P VI S KFTNMMM I GGNKV 
LARSLM I QTLEAVKRKQ FEKYHAASAEEQATI ERNP YTI FHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 

KKHQRTLMPEKLSHKLLEAFHNQGPVIKRKHDLHKMAEANRALA 
HYRWW 


5636 


2253 


1143 


LEDTICQHPPAEKKLYLYHRKLREVERNGIPRLPKDVFMDTHQG 
LTDVRAKVTGFSEGVVDSVKGGFSSFSQATHSAAGAVVSKPREI 
AS LIRNKFGSADNI PNLKDSLEEGQVDDAGKALGVI SNFQSS PK 
YGSEEDCSSATS3SVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDLTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRI S KMELQQQQQQWQLEGLENATARNLLGKLIN I 
LIJVVMAVLLVFVSTVANCVVPLMKTRNRTFSTLFLWFIAFLWK 
HWDALFS YVERFFSS PR 


5637 


948


2532 


MSFCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLPPPHLH
HHHHPQHHLHPGS AAAVHP VQOHTS S AAAAAAAAAAAAAMLNPG 
QQQPYFPS PAPGQAPGPAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQQLDIEPDRPIGYGAFGVWSVTDPRDGKRVALK3QYPNVFQNL 
VS CKR VFRE L KMLCFF KHDNVLS ALD I LQ P PH I D YFEB I YWTE 
LMQSDLHK 1 1 VSPQPLS SDHVKVFLYQILRGLKYLHSAG I LHRD 
I KPGNLLVNSNCVLKI CDFGLARVEELDESRHMTQE WTQ YYRA 
PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPlQQIiDL 
I TDLLGTPS LEAMRTACEGAKAH I LRGPHKQPS LPVLYTLS S QA 
THFAVHLLCRML.VFDPYKR T^AKTlATvAHPYT nFHRT RVTTTPMPK' 
CCFSTSTGRVYTSDFEPVTNPKFDDTFEKNLSSVRQVKEIIHQF 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMS ELDQLRQ E AEQL KNQI RDAR KACADATLS Q I TNN I D PVG 
RIQMRTRRTLRGHLAKI YAMHWGTDSRLLVS ASQDGKL I I WDSY 
TTNKVHAI PLRSS WVMTCAYAP S GN YVACGGLDNI CS I YNL KTR 
EGNVRVSRELAGHTG YLS CCRFLDDNQI VTSSGDTTCALWDI ET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 

PROTl?Tf3WT?QnT1Jl-TPPPD>JrTMZP-_ r TY2CrinZi'Pr , DT TTOT D XHAPT 
V-rty l r i <3i\CtiDiJ±riJ\.A.\^r r rN\ji\J\ri\L\j&UUn.l^KLjcUl^ 

MTYSHDNI I CGITS VS FS KSGRLLLAG YDBFNCNVWDALKADRA 
GVLAGHDNRVS CLG VTDDGMAVATGS WDS FLKI WN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
R I QMRTRRTLRGHLAKI YAMHWGTDS RLLVS ASQDGKL 1 1 WDSY 
TTNKVHAI PLRS S WVMTCAYAPSGNYVACGGLDNI CS I YNL KTR 
EGNVRVSRELAGHTGYLS CCRFLDDNQ I VTS SGDTTCALWD I ET 

r3nOT"T 1 'PP , TY3WTY3m7MQT.QT.&DTTP13T V\TQ(~2 B PT\X C R VT umnacnM 
uwi I IV I V3xl xoU v Mo Jjo JxHtriJ Xi\XjP Voo/\UL/AoAlUjnUVKikif'j 

CRQTFTGHESDINAICFFPNGNAFATGSDDATCRIjFDLRADQEL 

MTYS HDNI ICGITSVSFSKS GRLLLAG YDD FNCNVWDAL KAD RA 

rz\rT , n nunwuUQP t ./2\rT , 'nrviM a \i zvrn c urvc x?t vtuyt 
ovjjAuniJKKVoLJjuv x UiAil^VMlusnL>oriiA.xnM 


5640 


280 


1092 


QCONKKTMLSHNTMMKQRKQQATAIMKEVHGNDVDGMDI^KKVS 
I PRDI MLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHS I AMQNGKVDGS NLEGG S QQAPLT P PNT P DPRS P PN PDN IAP 
G YS GPLKEIPPEK FNTT AVP KYYQ S P WE QAI SNDPELLEAL Y P K 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
IiTDPRFMSFVNPLSGRRSFNRTPKGWISENIPIVITTEPTDDTT 
VPESEDL 


5641


Z / 




rpUNrwrnuvT t QianMnvT rngm PTPurt t upt nroT^vr tor — 
L.K tiri L.N lj u V ]S.L>Lii3 N y I^JUjr Ar nijr 1 r rtUJ^HFJjDbbiyKLilQA 

E 1 1 LS DNSS ILVLENNFL FKVKSKQ F IHLIAKKFYI S IT I VS AS 

NGESFVLSMIVTG 


5642 


199 


1247 


IT P CRM D FLVLFL F YLAS VLMGLVL I CVCSKTHSLKGLARGGAQ 
I FS C 1 1 PECLQRAMHGLLHYLFHTRNHT F I VLHLVLQGWVYTE Y 
TWEVFGYCQELEIiSLHYLLLPYLLLGVNLFFFTLTCGTNPGIIT 
KANE L L FLHVYE FD EVM F P KNVRCS TCDLRKP ARS KHCS VCNW C 

\7HT?P r nWUn\nj\7MMr , TnA1iJTJTDV , T7T TVT7T TT.TI'V O 7\ 7\ m m TT fC? ""PTC 
VrlKr unntVH viifllUxwwflli IX I r xjJL X Vxjx l_)xA_sAAi VAlVb 1 I r 

L VHL WMS DL YQET YI DDLGHLHVMDTVFL I Q YL FLTFPR I VFM 
LG F VWLS FLLGG YLLFVL YIjAATNQTTNE WYRGDWAWCQR C P L 
VAWPPSAEPQVHRNIHSHGLRSNLOE I FLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVEraGPGSRAARGPRVVMHRRGVGAGAIAKKKIAEAK 
YKERGTVLAEDQLAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
RVQFQDMCATIGVDPLASGKGFWSEMLGVGDFYYELGVQIIEVC 
IiALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 

ALGTGFG T T P VRft T YT .TD^VPAFT .NMDWTWT .DT . A R IfMfiWTVQ 

EIKASLKWETERARQVI^HLLKEGIAWLDLQAPGEAHYWLPALF 
TDL YS Q E I TAEEAR EALP 


5644 


83 


1138 


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQEKKRFTEEVIEYFQ 
KKVSPVHIiKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAI EDKDMQQKEQQFREWFLKEFPQ I RWK IQES I ERLRV1ANE 
I EKVHRGCVI ANWSGSTG I I*S VIGVMLAPFsTAGT.QT.Q TTAAHV 
GLGIASATAGIASSIVENTYTRSAELTASRLTATSTDQLEALRD 
ILHDITPNVLSFALDFDEATKMIANDVHTLRRSKATVGRPLIAW 

RYVP invvi:ti j rtrgapirivrkvarni^katsgvlvvix>v\^ 

VQDSLDIxHKGEKSESAEIxIxRQWAQELEEmiNELTHIHQSLKAG 


5645 


537 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL
YLCLQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD


5646 


3745 


3328 


aeqygtsphllptmllssclppanvttkaatppplvlslttadp 
agkpapcrvtltllras I patkrasflss fi kmffbeleyilgf 
lsllkfhvhvsvysaichfqkegtgnsrsftctpelfprlqthl 

RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH
EEDTQRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLQEYQLPY








QRVLPLPIFTPAKMGATKEEREDTPIQLQELIALETALGGQCVD 
RQE VAE I TKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VLSELCGRHEAJuREVGAEWPPPTCSPNICSGLQQAGNTDWSLTM 
AP Q S L P S S RMAPLGMLLGLLMAACFTFCLS HQNL KE FALTN P E K 
SSTKETBRKETKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS 
HVRLNLQTGEREAKLQYEDKFRNNLKGKRLDINTNTYTSQDLKS 
AliAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDIiEYYVHQMDNAQD 
LLS FGGLQ WINGLNSTEPLVKE YAAFVLGAAFSSNPKVQVEAI 
BGGALQKLLVI LATEQPLTAKKKVLFALCS LLRHFP YAQRQFLK 
LGGLQ VLR TLVQEKGTE VLAVR WTLL YDL VTE KM F AE E EAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQ T LG VLLTT CRDR YRQD PQLG RTLAS LQAE YQ VLAS L ELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTVVSWL
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD


5650 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTVVSWL
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD


5651 


646 


1869 


ARQGQRQPWG*EARAltGPASESPRV*EGSGWEGPASP*TPGSTL 
AWGEGAG I R * ASGLTAAGAAS AAAA/ P P PTRGG PAPAGCGRAPP 
WPAPLRVPTHGRAPAPRSRAAPRAPALiSHGTAAAALS PASPAGP 
AD P * L PGKS S QS PPRG * R WGR S R S APAP AH P EH P AP AGS AS ASQ 
QTPGWPGSCCLAQGWQAEPU3APGAEDG\PVPPQRGFPLGTLGS 
PAGS WAGLAG YG* AGAPGTQATAPRAAGQT P VAAAPNCRV*GSA 
PAIiHRAPAAAD PGS PLQAP PRAWAS P AAAG PGLS S SD YCGGLGA 
GWRAGISPELLGAAGLSDNWARCPGPGPAE*GGQPGCRTXPASA 
CMP S P P VEGSLGLSRKGHGDLPSQAR * GWHE CRRARHL VPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
CE FCARS FRTS SNL V I HRR I HTGE KPLQCE I CG FTCRQKAS LNW 
HQRKHAETVAAIiRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 


5653 


66 


1401 


RGRLQS RGRLTLGLVLLLLD I LGARQHGQR VSHGWKGGFLTAPL 
CF PQ P CQ PGTRRGRRRS L KEATE PQLAMAEE FVTLKD VGMDFTL 
GDWBQLGLEQGDTFWDTALDNOQDLFLLDPPRPNLTSHPDGSBD 
LEP LAGGS PEATS PD VTETKNSPLMED FFEEGFSQEI /SRDVT Q 
GWLLELQFRRSLYRGHLVR* FARRSRKSSEV* YCHQRGKSHGMQ 



356 



WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
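
The legend above can be applied mechanically to any segment in the table. The following is a minimal sketch, not part of the original listing: the function name `summarize_segment` is hypothetical, whitespace is assumed to be a layout artifact and is skipped, and any character outside the legend (including lowercase letters) is reported as unrecognized rather than guessed at.

```python
# One-letter codes exactly as given in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols from the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Count residues and legend markers in one amino acid segment.

    Hypothetical helper: whitespace is ignored, and characters that are
    neither a legend residue nor a legend marker are collected so they
    can be inspected by hand.
    """
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in LEGEND:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}


if __name__ == "__main__":
    # Two short rows quoted from the table (backslash escaped for Python).
    print(summarize_segment("S**STADPLHL"))
    print(summarize_segment("GVRESVNYL\\VSQQNM"))
```

A non-empty `unrecognized` list is a hint that a row contains characters the legend does not define, so the row should be checked against the original page before it is used.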








ES * I KERTQS CVHRFHGRRFHG \ DNVS EKTLtPAkS ICS YRGEFP 
SYS DHS QQD S VQ EG EK P YQCSE CGKS FSG S YRLTQHW I THTRE K 
PTVHQECEOGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 
ECGKAFTR1FHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654 


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
NWKPFVYGGLASITAECGTFPIDLTKTRFQIQGQTNDAKFKEII 
YRGMLRALVR I GREEGLKAL YS G * VGLHAFLCHCSL FHKG I DFR 
PRLHRSQVKSLRCV* KEQIA* * /MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNXKI CLFKNI 


5655 


2 


867 


RP PG I RAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLS FPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMI PFKDEGDPQ\REKI FAE I VNPEEEGDLADI KSSLVNES 
EI I PASNGHEVARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLITYSDEHFS PGSHPSHI PSDVNSKQGMSRKP PAPDI 
PT F YPLS PGGGGQ I TP PLG WQGQ P 


5656 


228 


1066 


PRRVPPLPE FASG PGAAFFHSGRLQRS LTKDSAGCFSQCRS RAM 
LVLRSGLTKALASRTLAPQVCSSFATGPRQYDGTFYEFRTYYLK 
PSNMNAFMENLKKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 
YDNFPHRAEVRKALANCKEWQEQS 1 1 PNIARIDKQETE ITYLI P 
WSKLQKPPKEGVYELAVFQMKPGGPALWGDAFERAINAHVNLGY 
TKWGVFHTEYGELNRVHVLWWNESADSRAAVRHKSHEDP ISWG 
GVRESVNYL\VSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQERIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLKCTDTELQLRRDAI PCQALVAAVCTFS EQLLAALGY 
R YNNNGE YEES SRDASRKWLEQVAATG VLLHCQS LLS PATVKEE 
RTMLED I WVTLSELDNVTFS FKQLDENYVANTNVFYHIEGSRQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQQDINAQSLEKVQQYYRKLRAFYLERSNLPTDAST 
TAVKI DQLIRP INALDELCRLMKSFVHPKPGAAGSVGAGLI PIS 
SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGLLPKC 
I MQATD I MRKQG PR VE I LAKNLR VKDQM PQG APRL YRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVS PKGELGAWRGNSGRPKI IGRAAEAENEDRTLGRLLP 
GNERS Q PRS PLRLLAPQLKAEAAADKGLAP VPPP FS SGHSG PC \ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNP AGAGG \ AAVAG AAGGARRFLCG WEG FYGRP WVMEQR KEL 
FRRLQKWELNTYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSLLGFVATVTLIPAFRGHFIAARL 
CGQDLNKTSRQQIPESQGVISGAVFLI ILFCFI PFPFLNCFVKE 
QRKAFPHHEFVALIGALLAICCMIFLGFADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGNTTIWPKPFRPILGLHLDLGR*SYHCC 
PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


LNLYPSPCGG I PKLPGLPREAAAALGAS FLAEAPLPVTVRGS GL 
AGMAVTCDPKAFLS I CFVTLVFLQLPLAS ICQN*GTDSCASRGK 
ADFDVTGPHAPILAMAGGHVELQCQLFPNISAEDMELRWYRCQP 
S LAVHMHERGMDMDGEQKWQ YRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAQNASGERIKIOGWIRSVRSQKEVLF 








LHVNDGS SLES LQWADSGIiDSRELTFGSS VE VQGQL I KS PS KR 
QNVELKAEKIKVIGNCDAXDFPIKYKERHPLEYLRQYPHFRCRT 
NVLGS ILRIRSEATAAIHSFFKDSGFVHIHTPI ITSNDSEGAGB 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHLAEFYMIEAEISFVDSIjQDLMQVIEELFK 

attmMvlskcpedvelchkfiapgqkdrl*hmlknnfliisyte 
avei lkqasqnftftpewgadlrtehekylvkhcgni pvfviny 

PLTLKPFYMRDNEDGPQELEGSVA*HSLGLMILLSIWIGQP 


5663 


119 


698 


PADIGRSTAK.TPGPPRSLEMDDPRYGMCPLKGASdCPGAERSLL 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPWNIKRHEMVAKPPVICSHFP 

QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECKVH 
KEHDNKIjNQCLIPKKKK 


5664 
5665 


118 


572 


SLSMESNHKijGDGLSGTQKEAALRALVQRTGYSLVQENGQRKY~ 
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYEM 
RKMMD FNGNNRG YAFVT F SNK VEAKNA I KQLNNY E IRNGRLLG V 
CASVDNCRLFVGGIPKTKK 




347 


702 


WQHLIILLHCERrSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETEVKGKRKRGRPGRPPSTNKKPRKSPGEKSRIEAGIRGAGRGR 
ANGHP QQNGEGE P VTL FE WKLG KS AMQRC 


5666 
5667


213 


540


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAALVFYSCIFTT^ 
GLFVNITALWFSCTTKKRTTVTIYMMNVALVDLIFIMTLPFRM 
FY YAKDEWPFGE YFCQ I LGA 




1 


695 


HPLPSASLGLPSVSLGVSLCVRSAIiLEAWPMLPKRRRARVGSP 

SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 

VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPALLD 

ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACQR 

PTPLTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQ 


5668 


691 




CS FLFC I PDLFLQFLLGRKEEEAVLVGGEWS PSLDGLDPQADPQ 
VLVRTA 1 RCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD 

ADSRFNDRYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFPK 

ARGAPTKYSGSPIGSPTTTPPTRPPSFNLHPAPHLIiASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAW1PJULLPLLILCTVSVASYELAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWFQQKPGQAPVLVIYKDTBRPSGIPERFSG 
S TSGTTVT LTI S GAQ VEDRAD Y FCY SATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLGMESAITLWQFLLQLLLDQKHEHLICWTSNDGE 
FKLLKAKKVAKLWGLRKNKTNMNYDKLSRALRLLFMT 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSLYIRIVEGKNLPAKDTTG^ 
SDP YC I VKVDNE P 1 1 RTATVWKTLCPFWGEEYQVHLPPT FHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 

WPPSHSETS PLCS VWS PAQGKP FLLS PEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


I TVADQ I S HWSAGR I KNR TR I P E C I HS S AATTLAG PHTMEGE S V " 
KLSSQTLIQAGDDEKNQRTITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVEIEAHRW1AACS P YFCAMFTGDMS 


5674 
5675


17 


984 


GGGSMEGESTSAVLSGFVLGAlAFQHLNTDSDTEGFLLGEVkGE 
AKNS I TDSQMDDVE WYT IDI QKYI P CYOLFS FYNS SGPVNPn a 

LKKILSNVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DL VFLLLTP S 1 1 T E S CS THRLEHS LYKP Q KGLiFHR VPL WANLG 
M S EQLG YKTVSGS CMS TG FS RAVQTHS S KF FEEDGS L KE VHKI N 
EMYASLQEELKSICKKVEDSEQAVDKLVKDVNRLKREIEKRRGA 
QIQAAREKNIQKDPQENIFLCQALRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISM 




80 


753 


EGSRRGPTRLARLSARAGRLHFPPGFSSRLIHFRGVSECRRPPG 
KSGVP VSAPGSDGKWWEERPGMFS LMAS CCGWFKRWRE P VRKVT 
LLMVGLDNAG KTATAKG IQGE YP EDVAP TVG FSKI NLRQGKFE V 
riFDLGGGIRIRGIWKNYYAESYGVIFWDSSDEERMEETKEAM 








SEMLRHPRISGKPILVLANKQDKEGALGEADVIECLSLEKLVNE 
HKCL 


5676 


2 


930 


FVSS PPPRP VQPARPGGFGLSGRRSLLCQVASTPAHVGVMRS PV 
RDlJU^NDGEESTDRTPLLPGAPRAEAAPVCCSARYNIiAILAFFG 
FF I VYALRVNIiS VALVDMVDSNTTLEDNRTS KACPEHSAP I KVH 
HNQTG KKYQ WDAETOG W I LGS FFYG YI I TQ I PGGYVAS KIGGKM 

T T /-< EV TT /*^rn ■» T rr T*T PTD T T\ T\T"»T (~*\ ft -9 P»T T17T Ti A T T7>/~" T OP/^1JTOT4H 

LLGr GIIajI AVLilXjr 1 ir 1 AAUlAavtsWij J. VijKALJlljijtylLtsv lr PA 

MHAMWSS WAP PLERS KLLS IS YAGAQLGTVI S LPLSG 1 1 CYYMN 
WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDG FLELRRLSVP LCSGPCPLTS LSRQGERSGGHLVAAARAA 
VTAETHPL r IjLiAPLiAVCQS VK5 PAACQ VKPRPRAVALPAAlJGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLWLIHH 
RKHAG P I VS VWHRELRKAKSNRKLTFL YLAND VIQNS KRKGPE F 
TREFE S VLVDAFSHVARE ADEGCKKPLE RLLN I WQERS VYGGE F 
I QQLKLSMEDS KS P P P KATEEKKS LKRT FQQ I QE EEDDDYPGS Y • 

DVS LLEKI TDKEAAERLS KTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAU^ijMir^*^i^^*^H.f bLisVPRTr,! IEB 
SLAE FTEQFNQLHNRRNENLQLGPLGRDPPQECSTFS PTDSGEE 
PGQLS PGVQ FQRRQNQRRFSME VRASGALPRQVAGCTHKGVHRR 
AAALQ PDFD VSKRLS LPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYIMEPS I FNTLKRYFQAGGSPENVIQL 
LSENYTAVAQTVNLLAEWLIQTGVEPVQVQETVENHLKSLLIKH 
FDPRKADoIr IhriVj&IrAWJj&UWlArlA I wKJJLr X lU-iAJSAliPIJCLi 
MLNFTVKVGR V1»E LRRKVFMNVY FWLL VCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LiL CAKTLG V RTK.E o QAEG * N Kovj 1 N W H^AbD P R FCPSr CWMRSA 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFS QVMNMAAFLALVVAVLR F I QLKPKVLNP WLN I SG LVA 
LCLAS FGNTTLLGNFQLTNDEE IHNVGTSLT FGFGTLTCW I QAAL 
TL KVN I KNEGRRVG I PRVI LS AS I TLCVGPLLHPHGPKHPHVCS 

Uo P V \j P lati v Xi 


5682 


39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPI PS PGLSAQTGL 
QK I WG T I HCQ VCPG APAWPG S P WHEEMGLLLLVPLL LLPGS YGL 
PFYNG FYYSNSANDQNLGNGHGKDLLNGVKLWETPEETLFTYQ 
GAS V I L P CR YRYE P ALVS PRR VRVKW W KLS ENGAP E KDVLVA I G 
LRHRS FGD YQG R VHLRQD 


5683 


BS 


778 


GSCGATALI TRCLAWS VLI S RLAMATYT CI TCRVAFRDADMQRA 
HYKTD WHR YNLRRKVASMAP VTAEG FQ E RVRAQRAVAEEE S KGS 
ATYCTVCSKKFAS FNAYENHLKSRRHVELEKKAVQAVNRKVEMM 
NEKNLEKGLGVDS VDKDAMNAAIQQAI KAQPSMS P KKAP PAPAK 
EARJ^TVVAVGTGGRGTHDRDPSEKPPRLQWFEG^AKKIiAKHSEDD 
SEDEEHDLC 


5684 


195 


677 


twcfrgyi^prVImXaldeppyltvgtdvsakyrgafceakikt 

AKRL VKVKVTFRHDS S TVE VQDDH I KG P LKVGA I VE VKNLDG AY 
QEAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQL PLTNP EH FGTP V I GKKTNRGRR YE 


5685 


779 


1262 


LLLQQPWHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAIVTPQ 
VKQraPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKES AAINQ I LGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWSARBNYVS PS S I P VALHS 




128 


1181 


CTWWQVN I TLLD INDNHPTWKDAP Y Y I NLVEMTPPDSD VTTWA 
VDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHSAELMRK I WS VTDCGR PPLKATSS ATVFVNLLDLNDNDPT F 








QNLPFVAE VLEGI PAGVS I YQWAIDLDEGLNGLVSYRMPVGMP 
DMnPT.TNCQ QnuwrTTPT nDPDTirvm dwa cnnfTDTyccT 

KilUf i-iXINoo oVj v will ct jjUrCCrCJ. X ^JjK V VAoUny 1 1 IwO 1 

STLT I HVLDVNDETPTFFPAVYNVS VSEDVPR \GSGWSG * AARN 
NDVGLNAELSYFITGGNVDGKFSVGYRDAWRTWGLDRETTAA 
YMLILEAI DNGPVGKRHTGTATVFVTVLDVNDKRPI ILQSS YV 


5687


J. f 




H 1VD O A D DTV2 / ODD /DDD1 DDT / DP!31 & / AD IV CCrTVDDT ORPDMl 

QGDGGAAAVGH VLWP AVG PVR VNPGLQTP VPRP ELL PG P \ S S S 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SG CRM P S TS AS E / AAGGQGACTHAXGS ETPP PAS P QTS E PAP S P 
LPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHGTG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAFGPVPITTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGNCYRLLKTGIEHGAMPEQVGVYWYS/CLYDSRKLFF 
*SHMIIRSLL*KVIDDSLGQLPLLRELLL**LNVIDRCIILAYV 
LRVEKTFAITYLKNFTVKVDFSLLGEIPLISMAAILKLWIMKID 
DGYIPAVF 


5689 


1504 


3 


HELSGKHISMVSGNTCNWHPGGHS PGGGGQGE ITSKDRGE I PAL 
1WA/RKPIGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KLPRTRE PPLLQAGWAVRKP PWSEAKEGLGQAGRPSGMDS SAS \ 
PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG*PP*GGQGGGGH 
GAP S T PG PGGEAW * LPQQTSR P KPG PQ A Y * GE \G S PG LQC P CS K 
EL*RVPPGSLGPSTQCMYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSASSSHR * GG * ERARAGAGHRGST* A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GS PPPA* AS AGRKGTVSTLGGGLL 


5690 


1424 


58 


PS PFAGVC^AAPAPLPLLALARRDRRP CS PGAEAAPWQTGGPAI D 
GAWRTS VSALRRGATG /APCS PGAEAAPWQTGG PAI DG\ DGELP 
* VRS EEAPRGCGAEGGGPGSGPVRR PGAGRGAHAGOGRQQDPEP 
DGLRHRQHGAAS HARHRLQRLRPGHHQNRHVRRDPQAP PGGPAP 
(jhAftAliPJERTKu VAE f FAWAHAGSDAWRAGR*SQRT*ERARPRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
S LSLLGP / PGAHNLDTAPQDR* HGP*GDKRGAPGVAGEDPRPP+ 
GNFVR * LLLM P / G VA * RHGTS P FLGPS LG ENGGQWD S GNLFGTP 

onr nr ijvji onciACiiw i ni\nr nx\ \L/i\u!\UVjvniiiLijnwou 

SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


550 


I SNDP S PG YN I EQMAKRGKKL VEL P YTVKGMD VS FS G I LS FI ED 
VAHRMLATGE CTPEDLCFSLQVMQ * KTGTES WG*RFY I VEQN* S 
GDAPLI FS P YLSLTGNCGFAMLVE I TERAMAH\ CGS PGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548


TQAWTRAEKDRKGSVRALRLHLERGPPT* RGSHPL \QSVPCIQK 
PS IFSSYP I /GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 
TS RS VP PGRGALPPDSLSTRKGLPR PSTAGHRVRESGHKVP VSQ 
RLNLPVMGATRSNLQPPRKVAVPGPTR*RDQDSKQDFSSKPLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1258 


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV








KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI


5695 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS
DTTQCI 


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH
PTATKLAELPPTPLPATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATEAAPPPPEPVPAA 
QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDG EALGGN PM VAGFQDD VD LEDQ P RG S P P L PAG P VPS QD I TL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPP WPGGVS VRTG PEKRSSTRPPAEMEPGKGEQASSS ESD P 
EGP I AAQMLS FVMDDPDFESEGSDTQRRADDFP VRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPS KENKKKKKKGKEEEEKAAKKKS KHKKS KDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAE PQEDL P PLSQS SR FFQEQQKMNKSIiGPVS FKDVAVDFT 
QEEWQQLD PEQ KI TYRDVMLENYSNLVS VGYH 1 1 KPDVIS KLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPS.RQTVFI 
ETLI * R / ERGNV PGNTFDVETNPVPSRKIAYTHSLCNS CER \ GF 
NASS E Y I S S DGR YARMKADE CSGCGKSLLH I KLEKTHPGDQAYE 
FNQ 


5699 


2 


1448


RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD 
EAAHRGTI QTARHTRKLYVQGPASGPPLPRVSTQVAI * DEKPLA 
R PS / G RTN AP F PQGQ KPAG KAAPGPAAAGR VAMR \ PGH PGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL*RS 
TWLVGGARGPEGSG VRGSG WPSGCSDIGWALAGWNHS * HLDPNT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VP I LFQNPSGALRS RRTEPAG WVP PTRHE * DDG*TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRLAAGAaTSASARCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 
HPAHP P GS AP P WGALGGWAAARAS LPWS PS LCLS FP AVT P VAG L 
FPPGRG 


5700 


923 


597 


NGH KGVWE INIY*RRSNI HKNS KS E S HLNQDHS FP P PT PNS ARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E * CS IAS S L I KAILR VS VliS E 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICSWLIFDKGAK/NHATCSKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 
5702


3 


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVITPSRASESSASSDGPHPVITPSWSPGSDVTLLAEALVTVTN 
IEVINCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP 
DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 
P VS I EAG S AVGKTTS FAGSSAS S Y S P S EAALKNFTP S ETLTMD I 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A+VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG*CSSSTGNSTPTRLTSR5PYCVSGEANG/PSAAARHVPYAKR 
G CCP * PG PP PTDCS CVT VLRGTQKVPMKGSMS KPLTPDVATGPS 
LTS TG VY VWGGAS P VPRGVLGLTLAHVLCFS KE KT 


5703 


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLSYRCPWQA 
P KAG I GTKAKPSES HLKLHPGWPS LDRQGE PATLGTGTGHCSDS 
RILRWHP * HTAAR* PRWRRLPSSHRWTRHK3VLRVQDKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW+APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS * WCPWL+ AARWTGWRTASGAS AGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC 
ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE 
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGSVEEQRQQQAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLLHYE I FVFLLLCS I 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGPRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MFAI QPGLAEGGQFLGDPP PGLCQPELQPDSNSNFMASAKDANE 
NWHGMPGRVEP I LRRSSSESPSDNQAFQAPGSPEEGVRS PPEGA 
EI PGAEPEKMGGAGTVCS PLE DNGYASSSLS I DSRSSSPEPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1928


SFSWEETISPCFP KMP AE P W W LS P VS LGAAG W PGQPRP YLDLPA 
QAS VS R PHDRA + G EAVS LS LS SGD VCGHTDGGGAGS DPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRS STAA * LHRRAGAGS LCLSASLLPPSFS LG APGAPS P L 
RVSPASGGPRKEGRQGSGG*AGGGGP\ARTHADLPCVGFVCSPP 
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
SRRRRGP *AAGRSTPAVP * PCS*GGAGRRAYACRTGWGYAPSR* 
LEPSGPTSGSAL*TWASHSTGA* *SRLCGTAGTGPLCSQSSRS* 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGSRCVCTCTGLPCPG I PLSGASPGGSGETGAGRSHTLK 
AARS RLS P RPGSGS RGS Y * S HNDNWGT WP APPSAGHLL VGG * NS 
QRTSSDH* YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


I T LC P LP QTEKCLNWTEAAT P LG I YL KAR VE AGGLKE LE I S WG ' 
LHQI WRWGAWMRAGMGGCRCWGVMAPFAPR/NALS FLVNDCS 
L I HNNVCMAA VFVDRAGE WKLGGLD YMYS AQGNGGG P PR KG I PE 








LEQYDPPELADSSGRVVREKRSADMWRLGCLIWEVFNGPLPRAA 
ALRNPGKIPKTLVPHYCELVGANPKVRPNPARFLQNCRAPGGFM 
SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPEDFCRHK 
VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MFSSTDRAMRIRLLCX5MEQFIQYLDEPTVNTQIFPHVVHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCLG KI GS YLS AS TRHR VLTS AFSRATRD P FAPS R VAG V 
LGFAATHNLYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS, 
FLS KLES VS ED P TQLE E VE KDVHAAS S PGMGGAAAS WAGWAVTG 
VS S LTS KL I RSHPTT APTETNI PQRPTPEGVPAPAPTP VPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVS RASQVS \TPTTNPPNPQS PTGAAGK\ RGLLGTGLA 
GAKLPGATS *RYTAGQRV 


5710 


1 


562 


I PGST I SCE VELMARMAKT IDSFTQNQTRLWI I DGLDACEQDK 
VLQMLDTVRVLFSKGPFIAIFASDPHIIIKAINQNLNSVPSGFK 
\ LNGHD YMRN I VHLP VFLNS RGL / RQ / LQENFS * LQQQM ETFHA 
QILQG YRKMLTEE FHRTALGR*QNLVARQPS IDG * DAIGFELYV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRHP FQWTTVTQEAFSHHDVAFTS TPVLFYPDS AQPF I VKS ESS 
SQ IAKAVLSQQRPSLFHECAFHFFS * SLQRHTINLDQGI F* LLM 
LSEERQHLFESS/IWTTPHNLK*/FEIHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLD I S ERLKFLLTLDCVDDTL I VLAEBHGCLD IIKELP 
ETVIDLLNKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPFTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 
AGGDLEKELVNKE 1 1 RSKP P I CTL PNFLFEDGE S FGQGRDRS S / 
T FR * YHWD I WM PAKK * I ERCWGRS I L P I TLKMTSL I LP Y S NSN 
NELSAAATLPLIIREKDTEYQLNRIILFDRLLKAYPYK2CNQIWK 
EARVDIPPLMRGLTWAALLGVEGAIHAKYDAIDKDTPIPTDRQI 
EVDIPRCHQYDBLLSSPEGHAKFRRVLKAWVVSHPDLVYWQGLD 
S LCAP FL YLNFNNEALVYACMS AFI P KYL YNFFLKDNSHVI QE Y 
LTVFSQMIAFHDPEIiSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 
HKI FHLW \ DTLLLGEFLFP I L YWE 


5713 


634 


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR*FQILGPMEGH 
T ACRC SRRG AQ VQHL PRE D I RAAE * D PHLRE VW PGL PTS SATS P 
* RAVLTS P CSHLGSADAAS SHWLCGVS FH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS AWAAAAPAS VADDTPP PERRNKSG 1 1 S EPLNKS LRRSRPLS 
HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPASKLQGGGGGLQTGWGLHPVPVTAASPLPRWCIjFGAVAKX 
GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCIiHPPSMVENN 
DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
Q I RRKETKPL* RKTPAG\NNYQSNSIPVSQS PQLTVDLLPSAGR 
TQ APSGRG DAG K PT PGHG \ LP KAS VI LTPNC P CS LAGGQ * P PGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC*EV\GALGEPVRIPG 
L*?DLSCILSNGSKHRREGLSFPRSLGPGRRGPAGLQSLGCSPT 
PKNTACHS SGHVALQAGHDSARDVGSGHVALQAGHDS TQDVGRP 
VWRWIPLE *LGLSRETGQATRRGLVWIS PGRAAAACVACAQALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/GVPGTDPKRGGRKPGQSGQETQGPTVWSGPESPLQPKP*E 
RQE/VGAGASSGVGLSRGRAGGPSSAWEVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


R VFS LLCEG PGHCY QGAVCREACAAAS PG LDS AAE PHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL*SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 








GDSLGARPGLPYGLSDDESGGGRALSAESEVEEPARGPGEARGE 
RPG P ACQL CGGPTGEG P CCGAGG PGGG PLL P P RLL YS CRLCTF V 
SHYSSHLKRHMO^SGEKPFRTORCPYASAQLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 
Q\ CG VKGRAS AGLDQNHCQS / S LFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGEKPYKCPL 
CPYACGNLANLKRHGRIHSGDKPFRCSLCNYSCNQSMNLIRHM 


5718 


120


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 


5719 


48 


428 


ELNNGPFQMPLCNGGNLAVTGSWADRSPLHEAASQGRLLALRTL 
LSQGYNVKAVTLDHVTPLHEACLGDHVACARTLLEAGANVNAIT 
I DG VT PLFNACSQGS PS CAELLLE YGAKAQP \ES CLPS P 


5720 


1 


1051 


LQAFRNAS EVPMVLVGTQDAI S AA\NPRVYRRTSRARKLS TDLK 

\rct\yye \TCGGTYGLQMWS vs fqdvaqkwal\rkkqq\lai 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DRE KKAAGCKVDS IGS GRAI PIKQG I LLKRSGKSLNK 
EWKKKYVTLCDKTGliLTYHPSLHDYMQNIHGKEIDLLRTTVKVPG 
KRLPRATPATAPGTSPRANGLSVERSNTQLGGGTGAPHSASSAS 
LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHSS PCCSLRRTERSSNAAVST / TTVQQFKRFI ENYRRHI GCVA 
VFYAI AGGLFLERAYYYAFAAHHTG I TDTTRVGI I LSRGTAAS I 
S FMFS Y I LLTM CRNLI TFLRET FLNR YVPFDAAVD FHR L IAS T A 


5722 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI


5724 


3 


1841 


FTOEAPPAPLFDASASPLSPHRRAKSLDRRSTEPSVTPDLLKFK 
KGWLTKQYEDGQWKKHWFALADQSLRYYRDS VAEEAADLDGE I D 
LSACYDVTEYPVQRNYGFQIHTKEGEFTLSAMTSGIRRNWIQTI 
MKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTEKQEAELGEP 
D P EQKRSRARE \ RRREGRS KTFDWAE FRP I QQALAQERVGGVG P 
ADTH\DPWRPEAEHGELERERARRREERRKRFGMLDATDGPGTE 
D AALRME VDRS PG L PM S DLKTHNVHVE I EQRWHQVE 7TPLRE E K 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 
NRLLQDQLRVALGREQSAREGYVLQATCERGFAAMEETHQKKIE 
DLQRQHQRELEKLREEKDRLLAEETAATI SAIEAMKNAHREEME 
RELEKSQRSQISSVNSDVEALRRQYLEELQSVQRELEVLSEQYS 
QKCLENAHLAQALEAERQALRQCQRENQELNAHNQELNNRLAAE 
ITRLRTLLTGDGGGEATGS PLAQGKDAYELEVPSGAR PCLTQLC 
TQ E PQGSAAWPLS YR WGGTDLRQQESGGPGRS KS P EGGEEQ 


5725 


3 


1049 


VNGHSEETSQSPNRTEPHDSDCS VDLG I S KSTEDLS PQKSGPVG 
SWKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVDG 
KN I VRS KAATLL YDQ PLQ VFTGSS S SSDL I SGTKAI FKFDSNHN 
PE /GAKYNKRPHKWAHNLHLKYMVLHS 1 1 SNTVAV\RS QRHFVA 
LQTKS PNRPCQFSSSAPS/ VDQRAQ/ INQS YAKHSANMNFSNHN 
NVRANTAYHLHQRLGPARHGEMWAI S PNDRLI PAVTRSTIQRQS 
SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSARTYSIDGPNASRPQSARPSINEIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPLrGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGS PGGSGEG PPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GL I FHLGQAR TP P YLQLQ VTEKQ VLLRADDG 


5727 


21 


221 


RP I LI LKETRRLPWATGYAE VINAGKS THNEDQASCEVLTVKKK 
AGAVTSTPNRNS S KRRS SLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGS AGGLRAPGAAAGGPGVQPRGSG/ LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHIiAEGGA 
KG S PRRLADPQD LP AGQMS LAPP FP P VAAVI R SNK 


5729 


1 


1525 


AGGARE VLTLQLGHFAG F VGAHWWNQQDAALGRATDSKE PPGEL " 
CPDVLYRTGRTLHGQETYTPRLILMDLKGSLSSLKEEGGLYRDK 
QLDAAIAWOGKIiTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KSIPNGKGSSPLPTATTPKPLIPTEASIRVWSDFLRVHLHPRSI 
CM I QKYNHDGEAGRLEAFGQGES VLKE P KYQEE LEDRLHFYVEE 
CDYL^GFQILCDLHDGFSGVGAKAAELLQDEYSGRGIITWGLLP 
GP YHRGEAQRN I YRLLNTAFGLVHLTAHSSLVCPLSLGGS LGLR 
PE PPVS FPYLHYDATLPFHCSAI LATALDTVTCS \ YRLCSS P VS 
MVHL \ ADMLS FCG KKWTAGAI I P FPLAPGQSLPDSLMQFGGAT 
PWTPLSACGEPSGTRCFAQSWLRGIDRACKTSQLTPGTPPPSA 
LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 
SCS PPGMVLDGSPKGAAVES VPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK 
LSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
K I E TEG FWERPRN FENQGR P LKS PGGEDCP S C * GG CPGSNY * AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPKKRRQG/CALPQGCLTFKDVAI 
EFSLEEW KCLNPAQRALYRAVMLENYRNLESVGLTS KDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQS PRRKS RRRAHVT.VTLVCGFTS FSFSLP LYLCGCLRF 
PERTCSQLQQADWAPDFGP S S FVPS WGATATGARKFLI AFNI \N 
LIXSTKEQAHRIALNLREO^RGKDQPGRLKKVQGIGWYLDEKNLA 
QVSTNLLDFEVTALHTVYEETCREAQELSLPWGSQLVGLVPLK 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQ YENDARTLFEFTSGVNDTES P 1 1 YRDES 
MRTACS PDGLCSLX3NGLELKCPFTSRDFMKFRLGG FEAIKSAYM 
AQVQYSMWVTRKNAWYFANYDPRMKREGLHYVVIERDEKYM\AS 
FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLI PAYS KNRAYAI FFIVFTVI 
GSLFLMNLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 
S MVG EGGAF PQ AVG VK PQNLLQ VLQ KVQLDS SHKQ AMMEKVRS Y 
GSVLLSAEEFQKLFNELDRSWXEHPPRPEYQSPFLQSAQFLFG 
HYYFDYIjGNLIALANLVSICVFLVLDADV^ 
VFIVYYIiLEMLLKVFALGLRGYLSYPSNVFIXjLLTVVLLVLEIS 
TL\VCTDCHTQAGGRRWW/RLLSLWDMTRMLNMLI VFRFLRI I P 
SMKPMAWASTVLGL 


5735 


2 


540 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARR I S L 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNTIAGRTYNDLNQ 
YP VFPW VLTNYES EELDLTLPGNFRDLSKP IGALNP KRAVF YAE 
RYETWBDDQSPPYHYNTHYSTATSTLSWLVRIVS I FI ELACLWY 
LKILT ! 


5736 


: 


382 


G TRPSTK KS G YS PQQ VAVIH CKGHQKENTAVAHSN Q KADSAAQV 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








X f\I\.LJ>j V X XT trVi LiLitr 1 v o c r\jtr UDr L/Vitr V 1 O X X X dI\.i-i/*o ULiTvf-vi'l fvIH 

QES**ILPDSGIFIP*T*TSYLQSTTHLRRAKLPQliLRR 


5737 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRANLGPCRRKR 
LQTLMRLAAG FQYS SHKDPS LS AKEKHTDYHNEARG P WPGWVG * 
RTADGS CGRGPDGAHHPGPKS SSWRASRLLPGLGGSHHLDAYVG 
RDLECGTPAPLQLB I P PQPRGHPAPIPTGQAGPRDSGPGAS P* V 
ETRPLTDGRR * PGVR P VGWT P AHPAGTLR PRGAVE P S VS ACG KW 
APS PTSQGCCEGRCDAVPiCHRAWRTPLCSQ 


5738 


8 


460 


DTLS LNCTLPETLPMTPSF* LS FL* FPGLARAKS I PTKTYSNEV 
VTLWYRP PD I LLGSTD YSTQI DMW * GQVEVWQGPCGKGGGLVTT 
ATQrAAr lir 1 VPSDPKOVULl t I tiMATGKPJjr PGSTVEEQljrlr I 
FR I LS E E AWALCAVE THR 


5739 


1 


1222 


S FQRRGIRWNVHTLHPHPRAVWAGIGRGHGS *ALLGRARAPALC 

nnqir t t?t?t C C T ^DHT D7lT D JMpT TITUJ 7\ f 1 TyTUD TiC T OHT T XPlf 

r r 1 1 1-iLic.r U£*ah£*k'VLtrJ\ljKJ*M\3Lin±in 

SAEVDGPVPGYLSS PQSITDTCLYI FTSGTTGLPKAARISHLKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSOSLLGIVGCMGIGATV 
VLKSKFS AGQ FWEDCQQHRVTVFQ YIGELCRYLVNQP PS KAERG 
HKVRLAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVATINY 

luUKvjAVtjKAi>WLil Kxl It tr roLlKl 1JV J. IVjr,Pl KJUlr'SJljtlLMA 1 b 

PGEPGLLVAPVSQQSPFLGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDLLVCDDQG FLRFHDRTGDP FRWKGENVATTE VAEVFEALD F 

Li\2Cj V W v luV 1 V 


5740 


265 


231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
YVYERVYN*NISRMVHALEQKraPAGLSSSMAl^l^PCLGMx>lA 
LQSELHKLYDEETQSWVSGSACGGYP 



5741 


1 


ot>U 


T3D VTMODr\7T MTT T/VICSMTT .DT .UTr'lfDTriDDDOr/Y'E T □ ft C/Tl 

YVARPGDKVAARVKAV1X3DEQWILAEVVSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPLPQWKANPBTDPEALFQKEQLVLALYPQT 
TC FYRAL THAP PQRPQDDYS VLFEDTS YADG YS P PLNVAQRYW 
ACKEPKKK* CRLADS PS PNDTGODSRGRAGI KH I PPLKKK 


5742 


2 


362 


TQSVKEILKRNPNVNLTDKDGNTALMIASKEGHTEIVQDLLDAG 

r i i v\rMTDnDCrtn , n7T TrifiXronnvxrv t wd ft t T/iwivnTnTDr , nnx7v 
1 x VIN 1 r JJKouJJ J. VLluftVKwnVaiVKnl tt lyf^I AUliJlKLiUUlNlV 

TAL YWAVEKGNATMVRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 
KATGREISPREKTPEVIDATEEIDKDLEETGRREISPEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAAEVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTS PTTTRQMTTT PAALPTTWTTPDLTTGTPLQMTT I A 

WWlMTfl OT TDCTT DVBXT^T.T TDCDC VUTDTT TBPC CPI/T D 

Vr 1 JAW 1 1 xjPinAI (ji-tiji Pr.iroivaCjr'l jux/Uioii 1 VLiP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGIPMSMKNEMPISQLLMIIAP 
S LGFVLFALFVAFLLRGKLMET YCSQKHTRLDY I GDS KNVLND V 
QHGREDEDGLFTL 


5745 


1400 


599 


G KSRFVNLMKHS KKT YDS FQDELEDY I KVQKARGLEPKTC FRXM 
KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 

opiUT.DiuncDT.DT.nCT.c vr , ni? , n?rv""E , QTrif ovpr-Mcnffrtnc 1 
V E»tNKijr ^ n JjrAMUoKJ-iKjjlJo J-io X X l\.L/\— " olLlvir V IrijN r IN vj\Jn< 

YICGSHGVEHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPEEGR 
EKSEEERSKHKRKXSCEEIDLDKHKSIQRKKTEVEIETVHVSTE 
KL KNRKE KKSRDWS KKEERKRTKKKKEQGQERTEEEMLWDQS I 
LGF 


5746 


3 


821 


SFASGRLTPSSPAFDGELDLQRYSNGPAVSAWSLGMGAVSWSES 
RAGERRFPCPVCGKRFRFNS IlaALHLRTHQPERPRSPAARLLLE 
LEERALLREARLGRARSSGGMQATPATEGLARPQAPSSSAFRCP 
YCKGKFRTSAERERHLHILHRPWKCX5LCSFGSSQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EATPTP APAAPEE P PAP P EFRCQVGGQS FTQS WFLKGHMRKHKA 
SFDHACPV 


5747 


2 


1328 


DRHVETLC I H FLGPSTGSTAKTGGRNWLKTGNCL YGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 
SDNGDINYDYVHELSLEMKRQKIQRELMKLEQENMEKREEIIIK 
KEVSPEVVRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 
AVS S PLLDQQRKS KTNQS KKKGPRTPS P P P P I PEDIALGKKYKE 
KYKVKDRI EEKTRDGKDRGRDFERQREKRDKPRSTS PAGQHHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
AS P YPSHSLSS PQRKQSPPRHRS PMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQFSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAG VI S KTLT Y PLDLFKKRLQVGG PEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGLSPSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 
S AS ST YSSAEERMQSEQ I RKLRRELES SQE KVATLTSQLS A^AN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAV IQGALNASETTP KELRI KRQNSSDS I SSLNS I TSHS S I 
GSSKDADA 


5750 


22 


866 


IFISICLWNAHLCFLLLPKDCIDQVMKI1QNLFVDDSGRYI1AIQF 
HL EW A YVFL YY YE YRKAKDQLD I AKD I S QLQ I DLTGALGKRTRF 
QENYVAQLILDVRREGDVLSNCEFTPAPTPQEHLTKNLELNDDT 
ILNDI KLADCEQFQMPDLCAEEIAI I LGICTNFQ KNNPVHTLTE 
VELLAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 
TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQIFEKLEMWE 


5751 


3 


751 


S CGS ALRAWRCGAAALAT FPAPAL PGLM YRAL YAFRS AE PNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRLQGLEQ 
DVLQAIDRAIEAVHNTAMRDGGKYSLEQRGVLQKLIHHRKETLS 
RRGPSASSVAVWTSSTSDHHLDAAAARQPNGVCRAGFERQHSIiP 
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


GPVCGVGLS VAWAGPWRG P VHS VGGGGRAALHGAELPCLSGAAT 
VEREMELRHKNEMLRVETBARARAKAERENADI IREQIRLKASE 
HRQTVLE S I RTAGTL FGEG FRAFVTDRDKVTATVN I F I KQGWQ V 
AERQHVGASWS PRSCPCRLCTAL 


5753 


34 


483 


DDSXAI PGG VQAPFGAVRN I YTPRTGHR I RKLDQ I QSGGNYVAG 
GQEAFKKLNYLD IGEIKKRPMEWNTEVKPVIHSRINVSARFRK 
PLQEPCT I FLI ANGDLINPASRLL I PRKTLNQWDHVLQMVTE K I 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHVVEFAGEHAFAIASREQEVLQGWKEIiLSACEDARLHVSST 
ADALRFHS Q VRDLLS WMDG I ASQ I GAADKP RCPS S LLG LPAS P W 
WPTPATPSPLTAPFSME 


5755 


3 


888 


LGDQ F Y KEA I EH CRS YNS RLCAE R S VRL P F LDSQTG VAQNNC Y I 
WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLEIK 
PEVELPLKKDGFTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEED I P KRKNRTRGRARG SAGGRRRHD 
AASQEDHDKPYVCDICGKEYKNRPGLSYHYAHTHXiASEEGDEAQ 
DQETRS P PNHRNENHRPQKG PDGTVIPNNYCDFCIiGGSNMNKKS 
GRPEELVSCADCGRSAHLGGBGRKEKEAAA 


5756 


3 


621 


SSKLQALFAHPLYNVPEEPPLLGAEDSLLASQEALRYYRRKVAR 
WNRRHKMYREQMNLTS LDPPLQLRLEAS WVQFHLG INRHGL YS R 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRI LDFRRVPPTVGRI VNVTKE I L 


5757 


3 


473 


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ 
LS I SQSVHVAVKVP P L I QPFEFPPAS IGQLLY I PC WS SGDMP I 
R I TWRKDGQVI I SGSG VTIES KEFMS SLQ I S S VSLKHNGN YTC I 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVSRTDGTVEIYNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGEI MEYDLQALNI KYAMDAFGGP I WSMAAS P 
SGSQLLVGCEDGS VKLFQITPDKI PV 


5759 


2 


1240 


GNAAFAGQG WYETFHMS DL PS YTTNGTVHWVNNQ IGFTTDPR 
MARS SPY PTDVAR WNAPI FHVNADDPEAVI YVCS VAAEWRNTF 
NKDVGADLVC YRRRGHNEMDE PMFTQPLMYKQ IHRQVPVLKKYA 
DKL IAEGTVTLQEFEEEI AKYDR ICEEAYGRSKDKKILHIKHWL 
DS PWPGFFNVDGEPKSMTCPATGI PEDMLTHIGSVASSVPLEDF 
KI HTGLSRI LRGRADMTKNRTVDWALAE YMAFGSLLKEG IHVRL 
NGODVERGTFSHRKIIVLHDQEVDRRTCVPMNHL.WPDQAPYTVCN 
S S LS E YGVLG FE LG YAMAS PNAL VLWE AQ FGD FHNTAQ C 1 1 DQ F 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDFEVSQL 


5760 


1 


1221 


VRDITSDSLSLSWTVPEGQFDHFLVQFKNGDGQPKAVRVPGHED 
GVTISGLEPDHKYKMNliYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PASTEPPTPEPPIKPRLEELTVTDATPDSLSLSWTVPEGQFDHF 
LVQYKNGDG Q P KATR VPGHEDR VT ISGLEP DNKY KMNLYG FHGG 
QRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQWRVGGEESEVT 
VGG LE PGRKYKMHLYGLHEGRR VGP VS TVG VTAP QED VDETPS P 
TEPGTEAPEPPEEPLLGELTVTGSSPDSLSLSWTVPQGRFDSFT 
VQYKDRDGRPQAVRVGGQESKVTVRGLEPGRKYKMHIiYGLHEGR 
RLGPVSAIGVT 


5761 


3 


1275 


S CDMAEAAAL VW I RGPGFGC KAVRCAS G RCTVRD F I HRHCQDQN 
VP VENF FVKCNGALINTSDTVQHGAVYS LEPRLCGGKGGFGSML 
RALGAQ I EKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAAS S KMVSAE I SENRKRQWPTKSQTDRGAS AGKRRC FWLGM 
EGLBTAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGS QRARWNTDHGS PEQLQ I P VTDSGRH ILEDS CAELGESK 
EHMESRKVTETEETQEKKAES KEPI BEE PTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGQTPLHSCGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MS SE E AANGKKSHWAELE I SGKVRSLS AS LWSLTHLTALHLSDN 
SLSR I PSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKDTGL I MLIARLDYEL I QRFTLT 1 I ARDGGGEETTGRVR INV 
LDVNDNVPT FQKDAYVGALRENEPS VTQLVRLRATDEDS PPNNQ 
ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMRQLLRPIDRQRYDENEDLSDVEEIVSVRGFSLEEK 
LRS QL YOGDFVHAMEGKDFNYE YVQREALRVP LI FREKDGLG I K 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


QKILRLNNSHQPPTSSSNSKDCGGPASSGAGATAALADGLKFAS 
VQAS APQGNSHKETS KSKVKRS KTS KDANKSLPSAALYGI PE I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGS CG KSKEEKPGKSQS S RGAKRD KD AG KSRKD KHDLLQ 
GHQNGSGS Q AP SGGHL YG FGAKS NGGGAS PFHCGGTGSGS VAAA 
GEVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


SGLFSVDPASSQAMELSDVTLIEGVGNEVMVVAGVVVLILALVL 
AWLST YVADSGS NQLLGA I VSAGDTS VLHLGHVDHLVAGQGNPE 
PTEL PHPS EGNDEKAEEAGEGRGDS TGEAGAGGGVEPSLEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFWLIiGWWYFRINYRQFFTAPATVS LVGVTVFFS FLV 
FGMYGR 


5767 


2 


892 


nfratprpptrpelrtgteVilwyldwralmkrkrmkaniklvg 

SGFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
1 LTEVXVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP 
H I DE FFTLNS TPSRSA YDEPHLL VNT EKQKLELEKRR LDI EAER 
LQVEKERLQ I EKERLRHLDMKHERLQLEKERLQIERE KLRLQ I V 
NSEKPSLENELGOGEKSMLOPODIPTFKT.K'T,FRPRT,nT.PVnT?T/ , i 
FLKFESEKLQIEKERLQVEKDPJLRIQKEGHLQ 


5768 


3 


476 


SSRSRLS VS VS PPP PG I VELGP P FAWEFCSRLGS AVTS QRAG PA 
AAMVAKDYP FYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
SRVTSLANLIPPVKATPTjFflJF^OTT.nRc:T QPPQPQPDnTT addd 

WSRWAAPSS TKRRDSKLWS ETFD VC 


5769 


38 


667 


TKTKKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTSKGVIHL 
TKLRNLSS LDLRHI TELDNETAME I VKRCKNLI S LNLCLNWI IN 
DRCVEVTAKEGQNLKELYLVSCKITDYALIAIGRYSMTIETVDV 

PHI TFS TVLQDCKRTLERA YQMG WT PNMSAAS S 


5770 


\ 


484 


D^RRYDVKTPlfWQPTiT.PPUQ'KT.T AYTTDPT TJOVHT HDT TJTwpt t»t a 

FASQLKKTSLSLTPDVPEADLSEVDPKLVSNLMPFQRAGVNFAI 
AKGG RL LLAD DMG LG KT I Q AI C I AAF YRKEWP LL VWP S S VRFT 
WBQAFLRWLPSLS PDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHS SLGLEQLQDYM VTLRS KLGPLE I QQFAMLLRE 
YRLGL P I OD YCTGLLKL YG DRRKFT .T .T DP T D nnn T Vwn V 
LEG VG I REGG I LTDS PGR I KRSMS S TS ASAVRS YDGAAQRPEAQ 
AFHRLLAD I THD I E 


5772 


148 


383 


E FNLALVS P SHPQ I KAEDDQPLPG VLLSLSGGLFRSNLLTQDNG 
ILTFSNLVTCS AI YHLPVFPERE PGCSMRDLRVA 


5773 


2 


723 


KIPLSQEEITLQGHAFEARIYAEDPSNNFMPVAGPLVHLS7PRA 
DPS TR I ETtjVRQGDEVSVHYDPM I AKLVVWAADRQAALTKLR YS 
LRQ YNI VGLHTNIDFLLNLSGHPE FEAGNVHTDFX PQHHKQLLL 
SRKAAAKESLCQAALGLILKEKAMTDTFTLQAHDQFSPFSSSSG 


5774 


2 


592 


F VE EEN IRWRCGGSELNFRRAVFSADS KYI FC VSGDFVKVYST 
VTEECVHILHGHRNLVTGIQLNPNNHLQLYSCSLDGTIKLWDYI 
DGILIKTFIVGCKLHALFTLAQAEDSVFVIVNKEKPDIFQLVSV 
KLPKSS SOEVRAlTET.QFVT.nYTNmc D vr"T ZIPrMP mrvT/a amain? 
YLS VYFFKKETTS R VTLS S S 


5775 


3 


538 


SSGCCDPAAPS S LAEAATM PVSKCPKKS ES LWKGWDRKAQRNGL 
RSQVYAVNGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 
RP 


5776 


2 


484 


RLPQDCVCQNLSESI^TLCPSKGLIjFVPPDIDRRTVELRLGGNF 
IIHISRODFANMTGLVDTiTT.^PWTTciUTnPPCiPr.nT.PQr dot wt 

DSNRLPSLGEDTLRGLVNLQHLIVNNNQLGGIADEAFEDFLLTL 
EDLDLSYNNLHGPAVGLRGDAWVQPSTS 


5777 


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 
VKKLEQALKDGS AGLDPQLPGTCYS PHC PPDKAEAGS TLPENLG 
GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YR GSEGS P TKPF I NPL PKPRR TFKHAGEGDKDGKPG I G FRKE KR 
NLPPLP S LPP P PLPS S PPPS S VNRRLWTGRQ KSSADHRKS YE FE 
DLLQSS SESS RVDWYAQTKLGLTRTLSEENVYED I LD PPMKENP 
YEDIELHGRCLGKKCVLNFPAS PTSS I PDTLTKQSLSKPAFFRQ 
NSERRNV 


5778 


1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 
GG P CWLQLE E VPGPGP LGGGG PLRS PS S YSS DELS PG E P LTS P P 
WAPLGAPERPEHLLNRVLERLAGGATRDSAASDILLDDIVLTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGLGRJCQACLAMLLHFLDT 
YQGLLQEEEGAGH 1 1 KDL YLL I MKDESLYQGLREDTLRLHQL VE 
T VE LKI P E ENQ P PS KQ VKPLFRHFRR I DS CLQTR VAFRG S DE I F 
CRVYMPDHSYVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 
L I L VAVS S SG E KV LLQ PTEDCVFTALGINS HLFACTRDS YEALV 
PLPEEIQVSPGDTEIHRVEPEDVANHLTAFHWELFRCVHELEFV'" 
DYVFHGE 


5779 


138 


1671 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCAEVIIPLLS ' 
SVNVSDRGGRTALHHAAliNGHVEMVNLLLAKGANINAFDKKDRR 
ALHWAAYMGHLDVVALL I NHGAEVTCKD KKG YTPLHAAASNGQ I 
NWKHLLNLG VE I DE I NV YGNTALH I AC YNGQDAWNE L I DYGA 
NVNQPNNNGFTPLHFAAASTHGALCLELLVNNGADVNIQSKDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
ELLINTLITSGADTAKCGIHSMFPLHLAALMAHSDCCRKLLSSG 
QKYSIVSLFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVECI 
KLLQS S GAD FHKKDKCGRTPLH YAAANCHFHCI ETL VTTGANVN 
ETDDWGRTALHYAAASDMDRNKT I LGNAHDNS EELERARELKEK 
EATLCLEFLLQNDANPS I RDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSG FEES DSGATKS PLHLAVS EM P 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEBKSEPVS 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTIjRKTKKMM 
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHQQACLREK 
KKG.LNVIGASDQS PLQS PSNLRDNP 


5781 


19 


941 


RGSLGGHPWRPPMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF 

pappvsgtcppdliyaptpekaeggsqknhqpppgeraahrdge 
qapcragptrkvavaprppscp*gpe\pgeeprrpldrspplgq 
vqphftsqdaksaedeapsrhlgkhqprsaqvgsrldalqgpkt 

QHS IHTVTCKS PRQKEDRS PKPPQAPKHPEEHGRQS \QAPPPLP 

vapsrtcggc*twdpallvsp/pqgdstpelpap\qqptggpsr 
crqalppqg*rqqprqrpr/ptgasrshpakakgcqgppkirny 

NIMD 


5782 


5176 


1237 


DRSMMSMAADSYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 
P PMTPPLPP EE PPEGPALPTEQSALTAENTWPTEVPSLPSEES V 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 

pepessitltpvesawaeehewperpvtcmvsetpamsaept 
vlaseppvmsetaetfdsmrasghvasevstsllvpavttpvla 

ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVL 
E ?S WTVPE PPWAE PDYVT I PVP WSALEPSVPVLSPAVS VLQ 
PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 
ILESSIMSSHVMKGINLSSGDQNLAPEIGMQEIALHSGEEPHAE 
E HLKGD F YE S EHG IN IDLNINNHLI AKEMEHNTVCAAGTS PVGE 
I GEEKILPTSETKQRTVLDTYPGVSEADAGETLS S TGP FALE PD 
ATG\TSKGI EFTTASTLSLVNKYDVDLS LTTQDTEHDMLISTSP 
SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTNEPLPVKRD\DQ 
TLAALI\SLKESSGGEKEVPPPS*REHLPDSGFSANIEDINEAD 
LVRPVSSPRTWNVLPSPRAGL\EGP\LIASDFGPVQNLYSSPW 

\ssmp\erasgs\ssgekgg\yeifvkvicdthekskknknrdkg 

EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRS\QTRSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 
SVGRRRSFSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 
RTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 
RSPIRRKRSRSSERGRSPKRIiTDLDKAQLLEIAKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 

DDDVrVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PT P P KS QVTLT KE FP VS SGSOHR KKEADS VYR FWVD wvvrr ccw 

KDDDNVFSSNLPSEPVDISTAMSERALAQKRLSENAFDLEAMSM 
LNRAQERI DAWAQLNS I PGQFTGS TGVQVLTQEQLANTGAQAW I 
KKDQFLRAAP VTGGMGAVLMRKMGWREGEGLGKNKEGNKE P ILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQP PE FLLVHDS G PDHRKHFLFRVL I NGS AYQPNCMFFLNR 
Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKXSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRSPRGLSHS PWAVKKINPI CNDHYRSVYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
NDLIEE/ PI *SQ/ PKILFQQP/LILKVALNMARGLKYLHQEKKL 






WO 01/53312 



PCT/US00/34263 



LHGD I KSSNWI KGDFETI KI CD VGV S L PLDENMTVTD PEAC Y I 
GTEPWKPKEAVEENGVITDKADI FAFGLTLWEMMTLS I PHINLS 
NDDDDE DKTFDESD FDDE A YY AALGTR P P I NMEELDE S YQKVIE 
LFS VCTNED PKDR PSAAHI VEALETDV 


5784 


2669 


1388 


PR VRPR VRTDHNY Y I S R I YG P S DS ASRDLWVNI DQM E KD KVK I H 
G I LSNTHRQAARVNLS FDFPFYGHFLRE ITVATGGF I YTGEWH 
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL 
QDNYNLGS FT FQATLLMDGR 1 1 FG YKE I PVLVTQ I S STNHP VKV 
GLSDAFVWHRIQQ I PNVRRRT I YE YHR VELQMS KITNI SAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRODW 
VDSGCPEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785 


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH 
G I LSNTHRQAARVNLS FDFPFYGHFLRE I TVATGGFI YTGEWH 
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALWQWDHVHL 
QDNYNLGS FT FQATLLMDGR 1 1 FGYKE I P VLVTQIS STNHPVKV 
GLSDAFVWHR IQQ I PNVRRRT I YE YHRVELQMSKI TNI SAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRODW 
VDSGCPEESKEKMCENTEPVET\ FLEPPQP* ERQPPSSGS * LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5786 




1674 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 
S * H * KRNLSQRSSSMSRRPLSCARPHR* *RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP*WRPSGRLSTV*RA 
TGGSTATAPPKRFPRNWNPMMAE 


5787 


2 


1460 


Mfi c; i jv qvt^T.ADRVNCPX ICOGTLKEAGSLSNCG/HKNFCRACL 
T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWQLANV 
VENIERLQLVSTLGLGEEDVCQEHGEKIYFFCEDDEMQLCWCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLIKEREEIQEIQS 
RENKRMQVLLTQVSTKRQQVI SEFAHLRKFLEEQQS ILLAQLES 
QDGD I LRQRDE FDLLVAGE I CRFSAL I EELEEKNERPARELLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF 
LE KLCFELD YE PAH I S LD P QT S HP KLLLSEDHQRAQ FS YKWQNS 
PDNP Q R FDRAT C VLAHTG I TGGRHTWWS I DLAHGG S CTVG VVS 
EDVQRKGELRLRPEEGVWAVRLAWGFVSALGSFP\TRLTLKEQP 
RQVRVSLDYEVG WVTFTNAVTREP I YTFTASFTRKVI PF FGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHS VSGRS SAYGDATAEGH PAGPGS VS SS TGAI STTTGHQEGDG 
S EGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 
AIPYMQVILMLTTDLDGEDEKDKGALDNLLSQLIAELGMDKKDV 
S KKNERSALNEVHLWMRLLS VFMS RTKSGSKSS ICESSSL I S S 
ATAAALLSSGAVD Y CLHVLKS LLEYW KS QQNDE E P VATS QLLKP 
HTTSSPPDMSPFFLRQYVKGHAADVFEAYTQLLTEMVLRLPYQI 
KKI TDTNS RIP P P VFDHSW F Y FLSE YLM I QQTP FVRRQ VRKLLL 
F I CG S KE KYRQ LRDLHTLD S \ H VRG I KKLLEEQG I FLRASWTA 
S P QSALQ YDTL I S LMEHLKACAE I AAQRT I NWQKFC I KDDS VL Y 
FLLQVS FLVDEG VS P VLLQLLS CALCGS KVLRALAAS SGSS SAS 
S S PAP VAASSGQATTQS KS STKKS KKEEKE KEKDGETSGSQBDQ 
LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 
I YRNSS KSQQELLLDLMWS I WPELPAYGRKAAQFVDLLGYFS LK 
T PQTEKKLKE YSQKAVE ILRTQNHI LTNH PNSN I YNTLSGLVE F 
DG YYLE S DPCLVCNN P E VP FC Y I KLS S I KVD TR YTTTQQ WKL I 
GS HT I S KVTVKIGDLKRTKMVRT INLYYNNRTVQAI VELKNKPA 
RWHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 
T ET LQCPRCS AS VP AN PG VCGN CGENVYQCHKCRS I N YDE KD P F 
L.CNACGFCKYARFDFMLYAKPCCAVDP I ENEEDRKKAVSNINTL 
LDKADRVYHQLMGHRPQLENLLCKVNEAAP3KPQDDSGTAGGI S 
STSASVNRYILQLAQEYCGDCKNSFDELSKriQKVFASRFCELLE 
YDLQQREAATKSSRTS VQPT FTASQYRALS VLGCGHTS S TKCYG 
CAS AVTEHCI TLLRALATNPALRHILVS QGL I RELFD YNLRRGA 
AAMREE VRQLMCLLTRDNP EATQQMNDL 1 1 G KVS TALKGHW ANP 
DLASSLQYEMLLLTDSISKEDSCWELRLRCALSLFLMAVNIKTP 
WVEN I TLMCLR I LQ KL I KP PAPTS KKNKDVP VEALTT VKP Y CN 
E I HAQAQLWLKRD P KAS YD AWKKCLP I RG I DGNGKAP S KS ELRH 
LYLTEKYVWRWKQFLSRRGKRTS PLDLKLGHNNWLRQVLFTPAT 
QAARQAACTIVEALATIPSRKQQVLDLLTSYLDELSIAGECAAE 
YLAL YQKL I TS AHW KVYLAARG VLP YVGNL I TKE I ARLLALEEA 
TLSTDLQQGYALKS LTGLLSS FVEVES I KRHFKSRLVGTVLNGY 
LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFMAVCI 
ETAKRYNLDDYRTPVFIFERLCSIIYPEENEVTEFFVTLEKDPQ 
QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 
DSGMELLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 
LLGDATEEF I ES LDS TTDEEEDEEE VYKMAGVMAQCGGLECMLN 
RLAGIRDFKQGRHLLTVtiLIOjFSYCVKVKVNRQQLVKLEMNTIiN 
VMLGTLNLALVAEQESKDSGGAAVAEQVLS I ME I \ IQAEPNVEP 
LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLQGLLRIIP 
YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 
I AAG I K\NNSNGHQL\ KDL\ ILQKG I TQNALD\ YMKKHI P / SAA 
RIWDADI\WKSFCLRPALPFILRLLRGLAIQHPGTQVLIGTDSI 
PNLHKLEQ VS \ S DEG I GTLA\ EN L \ LE S LREHP D VNKKI DA \ AR 
RETRAEKXRMAMAMRQKALGTLG\MTTNEKGQVVD/TRTALLEA 
DWEELIEEP\GLTCCICREGYKFQPTKVIiGlYTFTKRWLGGVW 
ENKPRETSRATSTVSHFNIVHYDC\HLA\AVSLARGREEWESAA 
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 
T YQLNI HDI KLLFLRFAMEQS FSADTGGGGRESNI HLI P YI IHT 
GLYVLNTTRATSREEKNLQGFLEQPKEKWVESAFEVDGPYYFTV 
LALH I LPPEQWRATRVE I LRRLLVTSQARAVAPGGATRLTDKAV 
KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTEGGWSCSLAEYIR 
HNDMP I YEAADKALKTFQE EFMPVETFSEFLDYAGIiLSE ITDPE 
SFLKDLLNSVP 


5789 


1 


2407 


LPLHAVE KTGR PGQPALKM PGKLRSDAGLESDTAMKKGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
S PKS KKAKK\ KEEPSQND I S P KTKSLRKKKEP I EKKWS SKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFP I QAKTFHHVYSGKDL I AQARTGTGKTFS FAI PL 
I EKLHG\ELQDRKRGRAPQVLVLAPTRELANQVSKDFSDITKKL 
S VACF YGGTP YGGQ FERMRNG I D I LVGT PGRI KDH I QNGKLDLT 
KLNHVVLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHWVFNVAKKYMKSTYEQVDLIGKKTQKTAITVEHLAIKCH 
WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD 
AQSLHGDI PQKQRE ITLKGFENGSFGVLVATNVAARGLD IPEVD 
L VI QS S P P KD VE S Y I HRSGRTGRAGRTG VC I CFYQH KEE YQ LVQ 
VEQKAG I KFKRIGVP SATE 1 1 KASSKDAIRLLDS VP PTAI SHFK 
QS AEKL I E E KGAVEALAAALAH I SCATS VDQRS L I NSNVG FVTM 
ILQCS I EMPNIS YAWKE LKEQLGEE IDS KVKGMVFLKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRSFSKAFGQ 


5790 


3786 


1585 


ARRQRDPLQALRRRNQELKQQVDSLLSESQLKEALEPNKRQHIY 
QRCI QLKQAI DENKNALQKLS KADESAP VANYNQRKEEEHTLLD 
KLTQQLQGLAVTIS REN I TEVGAPTE EEE ES ES EDSEDSGGEEE 
DAEEEEE EKEENESHKWSTGEE Y I AVGDFTAQQVGDLTFKKGE I 
LL V I EKKPDGWW IAKD AKGNEGL VPR TYLE P YS EEEEGQES SEE 
GS EEDVEAVDETADGAE VK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVS FCYL I VLMRNRME TVEDTNG S ETG FRAWNVQ SRGR I FL VS K 
PVLMINTVDVLTTMGAIPAGFRPSTLSQLLEEGNQFRANYFLQ 
PELMPSQIiAFRDLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 
GMS IGVLSRHVRLCLFDGNKVLSN IHTVRATWQP KKPKTWTPS P 
QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGERG 
ELS CGWVFL KLFDASGVP I P AKT YEL FLNGGTP YE KG I E VD PS I 
SRRAHGSVFYQIMTMRRQPQLLVKLRSLNRRSRNVLSLLPETLI 
GNMCSIHLLIFYRQILGDVLLKDRMSLQSTDLISHPMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKEFLKVPRFLLVYH 
\GCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLSPDGVHEPFDLSEQTYDFLGEMRKNAV 


5791 


3 


1636 


LRVAE FAGTSR/ IGAGLI QP LHRAPARDHGLLRGGAAPALS VS H 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 
NVKIiKAQTYELQESNVQLKLT I VNTVGFGDQINKEES YQP IVDY 
IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSLKTL 
DLLTMKNLDS KVYI I P VI AKADTVS KTE LQKFKI KLMS ELVSNG 
VQI YQFPTDDDTIAKVNAAMNGQLPFAWGSMDEVKVGNKMVKA 
RQY PWG WGVENENHCDFVKLREML I CTNMEDLREQTHTRHYEL 
YRRCKLEEMGFTDVGPENKPVSVQETYEAKRHEFHGERQRKEEE 
MKQMFVQRVKEKEAILKEAERELQAKFEHLKRLHQEERMKLEEK 
RRLLEEEIIAFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 
FVKQKVPEHRRSSSQANFI KKKLEVCFDFAVI CFITS I FGEQPQ 
LLI FMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY 
LARRPKLQL\RHSFTTTRSHLGAEWNIDLVLNVEDFDVESKFER 
TVNVSVPKKTPJJNGTLYAYI FLHHAGVLPWHDGKQVHLVS PLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDG S S LP AD VHR YMKM I QLG KT VH YL PILFIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
I S FWKKKKS M I GMSTKAVLWRCF S TW I FL FLLDEQTSLL VL VP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
1 1 TM PTS HRLAC FRDD WFL VYL YQRWL YPVDKRRVNE FGE S YE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY 
LARR P KLQL\ RHS FTTTRS HLG AENN I DL VLNVE DFD VE S KFER 
TVNVSVP KKTRNNGTLYAYI FLHHAGVLPWHDGKQVHLVS PLTT 
YMVP KPEE INLLTGESDTQQ I EADKKPTS ALDEP VSHWRPRLAL 
NVMADNFVFDX3SSLPADVHRYMKMIQIX3KTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 
FGFS E KDADE VKG I FVDTNLYFLALTFFVAAFHLLFDFLAFKND 
ISFWKKKKSMIGMS TKAVL WRCFS TWI FLFLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAMKYLS YLL YPL C VGGAVYS LJLN T I K YKS WYS WL INSFVNG V 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5794 


1 


5016 


MGPRLSVWLLLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV 
KGQKGERGLPGLQGVIGFPGMOGPEGPQGPPGQKGDTGEPGLPG 
TKGTRG P PGASG YPGNPGL PG I PGQDGP PGP PG I PGCNGTKGER 
GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQM 
GLS FQGPKGDKGDQG VSGP PGVPGQAQVQE KGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGROGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RG F PGTS L PG PSGRDG LPG P PG S PGPPG Q PG YTNG I VE CQ PG P P 
GDQGPPG I PGQPGF IGE I GEKGQKGESCL I CD I DG YRG P PGPQG 
P PG E I G F PGQ PGAKGDRGLPGRDG VAGVPG POGTPGL I GQPGAK 
GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 

PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 

DKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGF 

PGPQGDRGFPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPGPKGV 

DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGL 

PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 

LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 

FPGLDMPG P KGD KGAQG LPG I TGQSGLPGLPGQQGAPG I PG F PG 

SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 

PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 

AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGSIGIPGMPGS 

PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 

TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 

DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 

GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 

GWPGAPGVPG PKGDPGFQGMPGIGGS PGITGSKGDMGPPGVPGF 

QG PKG LPG LQG I KGDQGDQG VPGAKG LPGPPGPPGPYDIIKGEP 

G L»PG P EG P PGL KG tiQGL PG P KGQQG VTGLVG I PG P PG I PG FDG A 

PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 

VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 

SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 

ITGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 

GYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCN 

YYANAYS FWLAT I ERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 

RT 


5795 


1192 


61 


STRSPTVEYISAHPHILFMLLKGYEAPQIAIjRCGIMLRECIRHE 
PLAKI I L FSNQ FRD F F KYVE LS T FDIAS D AFAT FKDLLTRH K VL 
VADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHN 

FAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 
RDLKKTAP*RALRDSKR 


5796 


2 


1078 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFVVNRKF 

fgeiglldpgmdvyggenielgikvwlcggsmevlpcsrvahie 

rkkkpynsnigfytkrnaiirvaevwmddyicshvyiawnlplenp 

GIDIGDVSERRAIJIKSLKCXNFQWYLDHVYPEMRRYNNTVAYGE 

lrnnkakdvcldqgplenhtailypchgwgpqlarytkegflhl 
galgtttllpdtrclvdns ksrlpqlldcdkvkss l ykrwnfiq 
ngai mnkgtgrcle venrglag i dl i lrs ctgqrwti kns x k* r 

EGAGALE pgpqdmaap pni wts CPGGETARGRQVLDGPPRASPG 
QHRDPG 


5797 


2 




P R VRQ KTL VD VTLENSN I KDQ I RNLiQQT YEAS MD KLRE KQRQLE 
VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 
KHSAEKMLLEETNSFLKAIEEANKKMQAAEISLEEKDQRIGEL 
DRLIERMEKERHQLQLQLLEHBTEMSGELTDSDKERYQQLEEAS 
ASLRERIRHLNDMVHCQQKKVKQMVEEIESLKKKLQQKQLLILQ 
LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP 
SQTGRTREIVMPSRNYTPYTRVLELTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNQEKQPYYEEQARLSKIHLEKYPNYKYKPRPKR 
TCIVDGKKLRIGEYKOIjMRSRROEMROFFTVfinnpnTPTTTr'Tn 

WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGSLAGNEMINGEDEMEMYDDYEDDPKSDYSSENEAPEAVSAN 


5799 


2679 


1435 


LLSTYIKFINLFPETKATIO^VLRAGSQLRNADVELQQRAVEYL 

TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL 

DEXjRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 

PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 

EADBLUJKWCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 

KTS VQ FQNFS PTWHPGDLQTQLA VQ 7KR VAAQVDGGAQVQQ VL 

WIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 

AAQDFFQRWKQLSLPC^EAQKIFKANHPMDAEVTKAKLLQFGSA 
LLDNVD PN P ENFVGAG 1 1 QTKALQVGCLLRLE PNAQ AQMYRLT L 
RTSKEPVSRHLCELLAQQF 


5800 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL 
TLSSVASTDVLATVLEEMPPFPERESSIIiAKLKRKKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADELLNKFVCKNNGVLFENQLLQ I GVKSE FRQNLGRMYLFYGN 
KTS VQ FQN F S P TWH PGDLQTQLAVQT KR VAAQ VDGGAQ VQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKI FKANHPMDAEVTKAKLLGFGSA 
LLDNVD PNP ENFVGAG 1 1 QTKALQVGCLLRLE PNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITSIKINRVDPSESLSIRLVGGSETPLVHIII 
QHIYRDGVIARDGRLLPGD1ILKVNGMDISNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDSFHVILNKSSPEE 
QLGT KLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVLAINGH 
DLRYGSPESAAHLIQASERRVHLWSRQVRQRSPDIFQEAGWNS 
NG S WS PG PGERS NTP KP LHPT I TCHEKWN I QKDPGE S LGMTVA 
GGASHRE WDLP I YVI S VEPGGVI SRDGR I KTGD I LLNVDGVELT 
EVSRSEAVALLKRTSSSIVLKALEVKEYEPQEDCSSPAALDSNH 
NMAPPSCWSPSWVNWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCGDXLLAVNGRSTSG 
M IHACLARLLKELKGR I TLT I VS WPGTFL 


5802 


3 


290 


CFSLYQIMERIMDLPTLLRHAFREMFSVGGLFWMFRIRIILCLM 
GAFFY LI S P LD FV PEALFG I LG FLDD FFV I FLLL I Y I S I M YRE V 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAEIYAYREEQDFGIEIVKVKA1GRQRFKVLELRTQSD 
GIQQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
S YKWWQKYQ KRKFHCANLT S WPRWL YS L YDAETLMDR I KKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQR 
LRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKKDMS PQKFWGLTRS ALLPTI PDTEDE ISPDKVI 
LCL 


5804 


2 


1707 


EMEKQRQEEQRKRTEEERKRRIEQDMLEKRKIQRELAKRAEQIE 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
RBEKERIKYEEDKRIRYEEQRPSLKEAKCLSLVMDDEIESEAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDEENQDTAKIFKGYRPGKLKLSFEEMERQRREDEKR 
KAEEEARRRI EEEKKAFAEARRNMWDDDSPEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQFEQREI 
DAALQKKREEEEEEEGS IMNGSTAEDEEQTRSGAPWFKKPLKNT 
S WDS E PVR FTVKVTG E P K P E I T WWFEGEI LQDG ED YQ Y I ERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


Y I S DTLGQVYKS K I RWW I E ENGGNGN I S VDDL I ALLDLAEHAS S 
AFKE S QQQS EDRE YE VKERL YPKS KRR YDTYN I AG YQGE I E VGL 
YTI QILQLI PFFDNKNELS KRYMVNFVSGSSDI PGDPNNEYKLA 
LKNYIPYLTKLKFSLKKSFDFFDEYFVLLKPRNNIKQNEEAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKPIHVFFGAAILSLSIASVISGINEKLFFSLKNTT 
RP YHS LPS EAVFANSTGML WAFGLLVLY I LIAS S WKRP 


5807 


2267 


1302 


RFS KKTFRR PMAVD I Q PAC LGLY CG KT LLFKNG S TE I YG E CG VC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFIEW 
YSGKKSSSALFQHITALFECSMAAIITLLVSDPVGVLYIRSCRV 
Li^LSDWYTMLYNPSPDYVTTVHCTHEAVYPLYTIVFIYYAFCLV 
LMMLLRPLLVKKI ACGLGKS DRFKS I YAALYF FP I LTVLQAVGG 
GLLYYAFPY I I LVLS LVTLAVYMS ASB I ENCYDLLVRKKRLI VL 
FSHWLLHA YG 1 1 S I S R VDKLE QDL P LLALVPTP ALF YLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


SLPDSGVVEYLSNGGVADNHKDFGELRYNECLMNFSCNGKNGSS 
EGRITHGFQLKSAYENNLMPYTNYTFDFKGVIDYIFYSKTHMNV 
LGVLGPLDPQWLVENNITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHLPNRR 


5809 




2422 


ILVPGFQG I LHPGVYCALQSQHQAQELVADIDECEVSGLCRHGG 
RCVNTHGS FE CYCMDGYLPRNGPE P FHPTTDATSCTE IDCGTPP 
EVPDGYIIGNYTSSLGSQVRYACREGFFSVPEDTVSSCTGIjGTW 
ES P KLHCQE I NCGNP PEMRHAI LVGNHS SRLGGVARY VCQEGFE 
S PGGKI TS VCTEKGTWRESTLTCTE I LTK INDVSLFNDTCVRWQ 
INS RRINP KI S YVI S I KGQRLDPMES VREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
S I FNETCLKLNRRSR KVGSEHMYQFTVLGQRWYLANFS HATS FN 
FTTREQVP WCLDLY PTTDYTVNVTLLRS PKRHS VQ IT I ATPPA 
VKQTI SNI SGFNETCLR WRS I KTADMEEMYLFHIWGQR WYQKEF 
AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRALSSELPWISL 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LPIiALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDS S LMLLQMAGVGLGS LAW! ILTFLSFSAV 


5810 


3 


1641 


KVFGTHKDHE VSTLDTAI S AVKVQLAEFLENLQEKSLRI E AFVS 
E I ES FFNTI EENCSKNEKRLEEQNEEMMKKVLAQYDEKAQSFEE 
VKKKKMEFLHEQMVHFLQSMDTAKDTLETIVREAEELDEAVFLT 
SFEEINERLLSAMESTASIiEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTTIAVYWSMNKEDVIDSFQVYCME 
E PQDDQEVNELVEEYRLTVKES YC I FEDLE PDRCYQW VMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 
TETYTLEYCRQHS PEGEGLRS FSG I KGLOLKVNLQPNDNYFFYV 
RAINAFGTS EQSEAALISTRGTRFLLLRETAHPALHIS SSGTVI 
SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 
VG ILLDYNNQRL I FINAESEQLLF 1 1 RHRFNEGVHPAFALEKPG 
KCTLHIjGIE ppds VRHK 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WD I EGAVRRYVQP FLNALGAAGNFS VDSQ I LYYAMLGVNPRFDS 
AS S S YYLDMHSLPHVI NP VE SRLGS S AASLYPVLNFLLYVPELA 
HSPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDSKTYNASVLPV 
RVEVDMVRVMEVFLAQIiRLLFGIAQPQLPPKCLLSGPTSEGIiMT 
WELDRLLWARSVENLATATTTLTSIAQLLGKISNIVIKDDVASE 
VYKAVAAVQKSAEELASGHIiASAFVASQEAVTSSEIiAFFDPSLL 
HLLYFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREEEVEPGTARPPPAASAMDASLEKIADPT 
LAEMGKNLKEAVKMLEDSQRRTEEENGKKLISGDIPGPLQGSGQ 
DMVS ILQLVQNLMHGDEDEEPQSPRIQNIGEQGHMALLGHSIiGA 
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHEEERE 
GLAKICRLAIHSRYEDFWDGFNVLYNKKPVIYLSAAARPGLGQ 
YLCNQLGLPF PCLCR VPCNTVFGSQHQMDVAFLEKL I KDD I ERG 
RL PLLL VANAGTAAVGHTDKI GRL KE LCEQYG I WLHVEGVNLAT 
LALGYVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LTLVAGLTSNKP7DKLRALPLWLS LQ YLGLDGFVER I KHACQLS 
QRLQES LKKVNY IKI LVEDELSSPVWFRFFQELPGS DPVFKAV 
PVPNMTPSGVGRERHSCDALNRWLGEQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVijCCrLQLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
G EN I HAG LLKKLNE LESDLT F KI G P E Y KSMKS CL YVGMASDNVH 
AAELVET I AATARE I EDNSRLLENMTEWRKGIQEAQVELQKAS 
E ERLLEEGVLRQ I PWGS VLNW FS P VQ ALQ KGRT FNLTAGS LE S 
TEP I YVY KAQGAGVTL P PTPSGS RTKQRLPGQKP FKRS LRGS DA 

LSETSSVSHIEDLEKVERLSSGPEQXTLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


(J99 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LIjRLLSPCVLLAGLCRGNSVERKIYIPLNKTAPCVRIjLNATHQI 
GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 

RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 

SNS YG PE FAHCRE I QWNS LGNGLAYEDFS F P I FLLEDENE TKVI 
KQCYQDHNLSQNGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 
SIQSTFSINPKIVCDPLSDYNVWSMLKPINTTGTLKPDDRWVA 
niKLiu&K&t tiftNV \APGAESAVAS FVTQLAAAEALQKAPDVTTL 
PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVOLENVDSFVEL 
GQ VALRTS LE L WMHTD P VS QKNES VRNQVEDLLATLB KS GAGVP 
AVI LRRPNQ SQPLPPSS LQRFLRARN I SG WLADHSGAFHNKYY 
QSIYDTAENINVSYPEWLEPLKE/ETWNFG*QDTAKALADVATV 
LGRALYELAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFQS 
I LQGRDLRS YLG * RGL FQH \ YIAV \ S S PTNT I YV/ VLQ YALANL 

TGTVVNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 
IAS KELEL I TLTVGFG I L I FSL I VTYCINAKADVLFI APRE PGA 
VSY 


5814 


8500 


432 


ALKCR P RR VLA I L VGP VQ PDRMAE EG A VA VC VR VRPLN'SRE ESL 

GETAQVYWKTHNNVIYPVDGSKSFNFDRVLHGNETPKNVYEA\I 

AAPIIDSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDHLGVIPQ 

GQFHGH FSQK I * E VFL DRE FLLR VS YME I YNET I TDLL CGTQKM 

KPL 1 1 RED VNRNV YVADLTE E WYTS EMALKW I TKGEKSRHYGE 

TKMNQRS SRSHTIFRMILES REKGE P SNCEG S VKVSHLNL VD LA 

GSERAAQTGAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI 

NYRDSKLTR I L QNS LGGNPKTRI I CTI TPVS FDETLTALQFAST 

AKYMKNTPYVNEVSTDEALLKRYRKE IMDLKKQLEEVSLETRAQ 

AMEKDQLAQLLEEKDLLQKVQNEKIENLTRMLVTSSSLTLQQEL 

KAKRKRRVTWCLGKINKMKNSNYADQFNIPTNITTKTHKLSINL 

LREIDESVCSESDVFSNTLDTLSEIBWNPATKLLNQENIESELN 

SIJiADYDNLVLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 

KDQEMQL I HE I SNL KNLVKHRE VYNQDLENELSS KVELLRE KE D 

QIKKLQEYIDSQKLENIKMDLSYSLESIEDPKQMKQTLFDAETV 

AliDAKRESAFLRSENLELKEKMKELATTYKQMENDIQLYQSQLE 

AKKKMQVDLEKELQSAFNEITKLTSLIDGKVPKDLLCNLELEGK 

ITDLQKELNKEVEENEALREEVILLSELKSLPSEVERLRKEIQD 

KSEELHIITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 

SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQEIVNLSKE 

AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 

NRDSPLQTVEREKTLITEKLQQTLEEVKTLrQEKDDLKQLQESL 

QIERDQLKSDlHDTVNMNIDTQEQLRNALESLKQHQETrNTLKS 

KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 

TADVKDNEIIEQQRKIFSLIQEKNELQQMLESVIAEKEQLKTDL 

KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 

TCDRLAEVEEKLKEKSQQLQEKQQQLLNVQEEMSENQKXINEIE 

NLKNELKNKELTLEHMETERLELAQKLNENYEEVKS ITKERKVL 

KE LQKS FET ERDHLRG Y I RE I EATGLQTKEELKI AH I HL KEHQE 

TIDELRRSVSEKTAQI INTQDLEKSHTKLQEEIPVLHEEQELLP 

NVKKVSETQETMNEIiELLTEQSTTKDSTTLARIEMERLRLNEKF 

QESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 

SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 

LSKRLQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKEKIKEIV 

AKHLETEEELKVAHCCLKEQEBTIKELRVNLSEKETEISTIQKQ 

LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 

KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKEEMKRVQEA 

LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 

I E H L KEQ FETQKLNLEN I ETEN I RLTQ I LHENLEEMRS VT KERD 

DLRSVEETLKVERDQLKENLRETITRDLEKQEELKIVHMHLKEH 
QETIDKLRGIVSEKTNEISNMQKDLEHSNDALKAQDLKIQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK 
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQ I KDQSLTLSK 
LEIENLNIAQKLHENLEEMKSVMKERDNLRRVEETLKLERDQLK 
ESLQETKARDLE I QQELKTARMLS KEHKETVDKLREKISEKT I Q 
I S D I QKDLDKSKDELQKKIQELQKKELQLLRVKED VNMSHKKIN 
EMEQLKKQFEPNYLCKCEMDNFQLTKKLHESLEEIRIVAKERDE 
LRR I KES LKMERDQF I ATLREM I ARDRQNHQ VKPEKRLLSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
RI MKKLKYVLS YVTKIKEEQHBCINKFEMDFIDEVEKQKELLI K 

TE FQQVLS NR KEMTQ F LE EWLNTRFD I E KLKNG I Q KENDR I CQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KNYQTLKTS liAS GAQ VNPTTQDN KN PHVT S RATQLTTEKI RE LE 
NSLHEAKESAMHKESKI I KMQKELEVTNDI IAKLQAKVHESNKC 
LEKTKETIQVLQDKVALGAKPYKEE I EDLKMKU5KIDLEKMKNA 
KEFEKEISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 
D PQP SNKPLTCGGGSG 1 VQNTKAL I LKSEHIRLEKE I S KLKQQN 
EQL I KQKNELLSNNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
S PKVTGTAS KKKQITPSQCKERNLQDPVPKES PKSCFFDSRSKS 
LPS PH PVR YFDNS SLG LC P E VQNAGAE S VDS QP \ GP W ARL FQG K 
UVr 


5815 


23 


1460 


S ELVMWTVQNRES LGLLS FP VMI TMVCCAHS TNE PSNMS YVKET 
VDRLLKGYD I RLRPDFGG P PVDVGMR I DVAS I DMVSEVNMD YTL 
TM Y FQQ SWKDKRLSYSGIP LNLTLDNR VADQLWVP DT YFLNDKK 
S FVHGVTVKlfiiM IRLHPDGTVLYGLRI TTTAACMMDLRRYPLDE 
QNCTLE I ES YGYTTDD I E F YWNGGEG AVTGVNK I ELPQFS I VD Y 

SWVS FWINYDASAARVALGITTVLTMTTISTHLRETLPKIPYVK 
AIDI YLMGC FVFVFLALLE YAFVNY I FFGKG PQKKGAS KQDQSA 
NE KNKLEMNKVQ VDAHGN I LLSTLE I RNETSGSEVLTS VSD PKA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVK I PDLTDVNS I DKWS RMFFP IT FS LFNWYWLYYVH 


5816 


861 


191 


TVYHERQRLE LCAVHALNNVLQQQ L FS QEAADE I CKRLAPD S RL 
NPHRSLLGTGNYDVNVIMAAI/X3LGLAAVWWDRRRPLSQLALPQ 
VLGL I LNLPS PVS LGLLS LPLRRRHLRW PCARL / VTVS YYNLDS 
K\ LRAPEGPGG LRTE \ * G P FLAAALAQGLCEVLL WTKE VE E KG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC * PPSPPAAD 
VMSNTTVPNAPQANSDSMVGYVLGPFFLITLVGVVVAVVMYVQK 
XKRVDRLRHHLLPMYS YDPAEELHEAEQELLSDMGDP KW\ QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGP CHLLPLLS P 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
HPQ ALMGRGF P S GMAAAGRHL CFL 


5818 


3 


3918 


QALRDKL W I FL VQS F YAVRHTES WKL MS TDDQQ K I Q AAAFD KGD 
DRRLG KKP I FSS S QQRKQ VS DSGDI KI KS WRGNNKKE CWS YLS T 
NKKMKSIX3LGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI 
TKELKTGGKNVSGKPKTVTKS KTENGDKARLENMS PRQWERS A 
T AAAAATGQ KNLLNG KGVRNQ E GQ I SGARPKVLTGNLNVQAKAK 
PLKKATGKDS PCLS I AGPSSRSTDSSME FS I STECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNS VDS VKNSTVAI KSRPVSRVT 
NGTSNKKSIHEQDTNVNNSVLKKVSGKGCSEPVPQAILKKRGTS 
NGCTAAQQRTKS T P SNLTKTQG S QGES PNS VKSS VS S RQS DENV 
AKLDHNTTTE KQAP KRKMVKQVHTALP KVNAKI VAMP KNLNQ S K 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNKNKDS VSEQKPHKPL INLASEI SDAEALQSS CRP\DPQ K 
PLNDQE KEKLALECQNI SKLDKSLKHELESKQICLDKSETKFPN 
HKETDDCDAANI CCHS VGSDNVNS KFYS TTALKYKVSNPNENS L 
NSN P VCDLDS TSAGQ I HL I SDRENQVGRKDTNKQSS I KCVEDVS 
LCNPERTNGTLNSAQEDKKSKVPVEGLTI PS KLSDESAMDEDKH 
ATADSDVSSKCFSGGLSEKNSPKNMETSESPESHETPETPFVGH 
WNLS TG VLHQRES PES DTGSATTS SDD I KPRSEDYDAGGSQDDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSNLRIEVK 
MKJtUbbNDbrQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GSVQFAQE I DQVSS S ADETEDERSEAENVAENFS I SNP APQQFQ 
G I INLAFEDATENECREFS ANKKFKRS VLLS VDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQN KGNS VCKNES T VLDLSS I DS S RKNKQS VS ATEKKNT I D VL 
SS RS RQLLRE DKKVNNGSNVEND I QQRS K FLDS DVKS QER P CHL 
DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSP I YEMDVI EAFEQ 
KVES E THVTDMDF* DDQH FAKQDWTLLKQLLS EQDSNLD VTNS V 
PEDLS LAQ YL INQTLLLARDSS KPQG I TH I DTLNRWS ELTSPLD 
SSAS ITMAS FSSEDCS PQGEWTILELETQH 


5819 


1 


5557 


AAAGL LCALHLVMTLVVAAARAE KE AFVQ S ES I IEVLRFDDGGL 
LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHEQPVGMP 
KMEKVYLHNPS S E+ TI TLVS I FATTSHFHAS FFQNRKI LPGGNT 
S FD VS / VFLAR VVGNVENTLF I NTSNHG V FTY \ Q VFGVGVPN P Y 
RLRPFLGARVTVNSSFSP I INIHNPHSEPLQWEMYSSGGDLHL 
ELPTGQQGGTRKIiWEIPPYETKGVMRASFSSREADNHTAFIRIK 
TNASDS TEFIIL PVEVE VTTAPG I YSSTEMLDFGTLRTQDL PKV 
LNLHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITLKAS\ESK 
YTKVAS I S PDAS KAK KPSQFSGKI TVKAKE KS YS KLE I P YQAE V 
LDG YLGFDHAATLFHIRDS PADP VERP I YLTNTFS FAIL I HDVL 
LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN 
NILLITNASKFHLPVRVYTGFLDYFVLPPKIEERFIDFGVLSAT 
EASNILFAIINSNPIELAIKSWHIIGDG\LSIELVAVDRGNRTT 

IISSLPECEKSSSSDQSSVTIASGYF\AVFRVKLTAKKL\EGIH 

DGAIQITTDYEILTIPVK\AVIAVGSLTCSPKHWLPPSFPGKI 
VHQSLNIMNSFSQKVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 

kkskianiyfdpglqcgdhcwglpflsksepkvqpgvamqedm 
wdadwdlhqs lfkgwtg i kensghrlsai fevntdlqkni iski 
taelswpsilssprhlkfpltntncss\eeeitlenp/sqdvpv 
yvqfi plal ysnps vfvdklvs rfnls kvakidlrtlefqvfrn 
s ahplqsstgfmeg\ lsphl i lnli lkpgekks vkvk\ ftpvhn 
rtvssliivr1wltvmdavmvqgqgttenlrvagklpgpgsslr 
fki teallkdctds lklrepnftlkrtfkventgqlqihietie 
isgyscegygfkwncqeftlsanasrdiiilftpdftasrvir 
elkfittsgse fvf i lnas lp yhmlatcaeal pr pnwelal y 1 1 
isg imsalfllvigta\ yleaqgi wep \ frrrls \ feasnp pfd 
vgrp fdlrr i vg i s s egnlntls cd pg hs rgfcgagg s s s rpsa 
gshkq*gpsghphsshsnrnsadvddvraynsgrtssmtsaqaa 
s sq panktrp lvlds ntgaqghs agrks kgakqsqhgsqhhahs 
pleqhpqpplpppvpqpqepqperlspaplahpshperassarh 
ssedsdi tslieamdkdfdhhdspalevfteqpps plpkskgkg 
kplqrkvkppkkqeekekkgkgkpqede lkdsladdds s sttte 

I oNFUJ c. V DIE K.QKG KQAM P EKHE S EMSQV KQKS KKLLN I 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK 

SRNAQKTKGTSKLVDNRPPALAKFLPNSQELGNTSSSEGEKDSP 

PPEWDSVPVHKPGSSTDSLYKLSLQTLNADrFLKQRQTSPTPAS 

P S P PAAP CPFVARGS YS S I VNS S S S SD P K I KQPNGS KH KLTKAA 

S L PG KNGNPTFAAVTAG YDKS PGGNGFAKVS SNKTG FS S S LG I S 

HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 

SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 

TLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 

RSSDPWSNSHFPHEN 


5820 


310 


1270 


RVS LSGPVS LGVLLCARS STMGKRDNRVAYMNP IAMARS RGP I Q 
S S G PTI Q \VI * I DQGL PGKK* KSN * KRKR K/ DS KALAE FE EKMN 
ENWKKELEKHREKLLSGSESSSKKRQPJCKKEKKKSW*\DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 
DSKDSLKKKKXS KDGTEKEKD I KGLSKKRKMYS EDKPLS SESLS 
ESEYIEEVRAKKKKSSEEREKATEKTKKKKKHKKHSKKKKKKAA 
SS SPDS P * H * EKSGFP YKES AMS EE I STVKTTTYLLKCMNFLVF 
GI IPGLFSSHSDATV 


5821 


179 


915 


KWRNQ S WR W P KP GTNWM LS CS VC WRRVTWTGS VWMRKLG KH PQT 
PT/IKDCSIAATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 
SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALRMQGTP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ 
RWKEASHRNQLRYSESMKILREMYERO 


5822 


464 


4379 


QTLKEMPI VMARDLEETAS S S EDEEVISQEDHPC IMWTGGCRRI 
PVLVFHADA I LTKDNNIRVI GERYHLS YKI VRTDSRLVRS I LTA 
HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTRKDRL YKN 1 1 RMQHTHGFKAFH I LPQTFLLPAE YAE FCNS YS 
KDRGP W I VKP VAS S RGRG \ VYL INN PNQI S LEENI LVS R Y I NNP 
LLIDDFKFDVRLYVLVTSYDPLVIYLYEEGLARFATVRYDQGAK 
N IRNQFMHLTNYS VNKKS GD YVS CDD P E VED YGNKWSMS AM LR Y 
LKQEGRDTTALMAHVEDLI I KTI I SAELAI ATACKTFVPHRS S C 
FELYGFDVLI DSTLKPWLLE VNLS PSLACDAPLDLKI KASM I SD 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
SDAEMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 
RGG F I R I FPTS ETWE I YG S YLEHKTSMNYMLATRLFQDRMT ADG 
APELKI * SLNSKAKLHAALYERKLLSLEVRKRRRRSSRLRAMRP 
KYPVITQPAEMNVKTETESEEEEEVALDNEDEEQEASQEESAGF 
LRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
L EP KFNLMQ I LQDNGNLS KMQ AR I AFS AYLQHVQ I \ RLM KDSGG 
QTFSASWAAKEDEQMELWRFLKRASNNLQHSLRMVLPSRRLAL 
LERTRILAHQLGDFIIVYNKETEQMAEKKSKKKVEEEEEDGVNM 
ENFQE F I RQ AS EAE LEEVLT F YTQ KNKSAS VFLGTHS KI S KNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 
E KEAKLVYSNS S SG PTATLQKI PNTHLSS VTTSDbS PGPCHHS S 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGLP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYIjNKHHS 
G I AKTQKEGEDAS LYS KRYNQSMVTAEIiQRLAEKQAARQYS PS S 
HINLLTQQVTNLNLATGIINRSSASAPPTLRPIISPSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VP KPPPNHEQVLRRATSQKAS KGSSAEGQLNGLQSSLNPAAFVP 
ITS STDPAHTKIMNHKHTEKQP VHHSWVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPS ACRAGDVNMDD PKKEDI IiLLADEKFDF 
D LS LSS S S ANEDDE VF FGP FGHKERC IAASLELNN P VPEQ P P LP 
T S ES PFAWS PLAGE KFVEVY KEAHLLALH I ES S S RNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLLGP PVGEPRLLAS S PALPS SGAQARLTRAPGP PHSAHALP 
RESCTARAASQAATQRKPGTKLLLPRAAS VRGRG I PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\NKIiGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSE FAS I PAN * LPGLCPN I SKS \GRMGPAMLRPA 
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG \QWLNS S CAWSES SQLNKTRS I RRRDS CLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 
SRPL I DLMTNTPDMNKNVAKPSP WGQL I DLS S PL I QLS PEADK 
ENVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDE P S ACRAGD VNMDDPKKED I LLLADE KFDF 
DLS LSSSSANEDDEVFFG PFGHKERC I AASLELNNPVPEQP PLP 
TS ES P FAWS PLAGEKFVEVYKEAHLLALH I ES S SRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHAL? 
RES CTAHAASQAATQRKPGTKLLLPRAAS VRGRG I PGAAEKP KK 
E I PAS P S RTKI PAEKE S HRD VL PDKPAPGAVNVPAAGS HLGQGK 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAN*LPGLCPNISKS\GRMGPAMLRPA 
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
Q FKI PKFS IGDS \PDS STP KLS RAQRPQSCTS VGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
E ALLVD I KLE P LAVT P DAAS QPLIDLPLI DFCDTP EAHVAVGS E 
S RPLI DLMTNTPDMNKNVAKPS PWGQLI DLS SPL I QLS PEADK 
ENVDSPLLKF 


5825 


2 


4210 


FLQI E S AS PA? FS SG FLAAHPHS PGGSLATKGRSRLS APGMLHb 
SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEL 
SGAERERPRHFREFTVCS I GTANAVAGAVKYS ESAGGF YYVESG 
KL FS VTRNR F I HWKTS GDTLE LMEES LD I NLLNNAI RL KFQN CS 
VLPGGVYVSETQNRVI I LMLTNQTVHRLLLPHPSRMYRSELWD 
S QMQS I FTD IGKVDFTDPCNYQLI PAVPG IS PNSTASTAWLSSD 
GEALFALPCASGGIFVLKLPPYDIPGMVSVVELKQSSVMQRLLT 
GWM PTA I RGDQS P SDR P LS LAVHC VEHDAF I FALCQDH KLRMWS 
YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYIj 
G I F \MHAPKRGQFCI FQLVSTESNR YSLDH I S SLFTSQETLID F 
ALTSTDIWALWHDAENQTWKYINFEHNVAGQWNPVFMQPLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
DLS WSELKKEVTLAVENELQGS VTEYEFSQEE FRNLQQE FWCKF 
YACCLQYQEALSHPLALHLNPHTNMVCLLKKGYLSFLIPSSLVD 
HL YLL P YENLLTE DETT I S DDVD I ARD V I CL I KCLRL I EES VTV 
DMS VI M EMS CYNLQS P E KAAEQ I LEDM IT I DVENVMED I CS KLQ 
EIRNPIHAIGLLrREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAGYIVCRGVHKIASTRFLICRDLLILQQLLMRLGDAVIWG 
TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECIiATDVPLDTLESN 
LQHLS VLELTDSGALMANRFVSS PQT I VELFFQE VARKH 1 1 SHL 
FSQ P KAP LSQTG LNWPEM I TAI TS YLLQLLWP S NPGCL F LE CLM 
GNCQYVQLQDYIQLLHPWCQVNVGSCRFMLGRCYLVTGEGQKAL 
ECFCQAASEVGKEEFLDRLIRSEDGEIVSTPRLQYYDKVLRLLD 
VIGLPELVIQLATSAITEASDDW\KSQATL\RTCIFKHHL\DLG 
\HNSQAYGSL* PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVNLHNEWGI I ES RARAVDLMTHNYYELLYAFH r YRHNYRKAG 
TVMFEYGMRIjGREVRTLRGLEKOjGNCYIiAALNCljUjIRPEYAWI 
VQPVSGAVYDRPGAS PKRNHDGECTAAPTNRQ I E I LELEDLEKE 
CS LAR I RLTLAQHD P S AVAVAGS S S AEEMVTLL VQAGLFDTA I S 
LCQT FKL PLTP VFEG LAFKC I KLQ FGGE AAQAEAW AWLAANQLS 
SVTTTKESSATDEAWRIjIiSTYLERYKVQNNLYHHCVINKLLSHG 
VPLPNWL I NS YKKVDAAELLRL YLNYDLLDLTP YQ VIR ICG C 


5826 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKQKNR 
AAAQRSRQKHTDKADALHQQHESI^KDNLAIiRKEIQSLQAELiAW 
WSRTLHVHERLCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG 
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AWAE P P VQLS PS P LL FAS HTGS S LQGS S S KLS ALQ PS LTAQTA 
PPQPLELEHPTOGKIX5SSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPS PHPLLAFPLLS SAQVHF 


5827 


194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP * * HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLE VALETLS S AE VCAG I YD I LLALI FLHDRGHLTHNNVCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQS IRDPAS I P P 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
SFQQTLHSTLLNPIPKWRJALCTLLSHDFFRNDFLEVVNFLKSL 
TLKSEEEKTE FFKFLLDRVS CLS EELIAS RLVPLLLNQLVFAE P 
VAV\KSFLPYLLGPKKDHAQGETPCLLSPALFQSRV I PVLLQLF 
E VH E EHVRM VL LS H I E A YVG AL»S LREQLKKV \ I L \ PQ VLLG \ LR 
D\TS DS I VA I TLHS LAVLVS LLG PEVWGGE RTK I FKRTAP \ S F 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSETIFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
EESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKP I PALLS LT 
EESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSEL 
GIX3EEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
EMVPKKDDVS PVMQFSS KFAAAE ITEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAACGELSYSCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDLTSS I PKPLLPVGNKPLI WYPLNLLERVGFEEV 
I WTTRD VQKALCAE FKMKMKPD I VCI PDDADMGTADS LRY I YP 
KLKTDVLVLSCDLITDVALHEVVDLFRAYDASLAMLMRKGQDS I 

T7t)VT)^nvr^VVV a\7TTnT?nPT^\7T1QTrtirRT.T.FMIkWK'QriT.r»FTrT ,UTV 
IS ir V r\^\Jt\\jrKC<r^r\v I1j\J1\lJ C JL\J V Uo 1 wi\_T\lJi_»r ilMPJ EiJ\U Ij U .Ej H l_i V J. A. 

G S I LQKH PR I R FHTG LVDAHLYCLKKY I VDFLMENG \ S ITS I RS 
BL\ I PYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
SFY*KEANYTGTGAPY\D\ACWI 


5829 


260 


1259 


PDGRL I VS CS EDKT I KI WDTTNKQCVNNFS DS VGFAN FVDFNP S 
GTC I AS AGS DQTVKWDVRVNKLLQHYQVHSGGVNCI S FHPSGN 
YLITASSDGTLKILDLLKGRLIYTLQGHTGPVFTVS FSKGGELF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDS P PHLLDI Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR* S ICRSLLPLLWISF 
LLILPQQQKPWGLCQTRVKRPVDIS*TLP*CHQNVCQQPRiCRK 
QKT*VTSPVKVK/VSIPLAVTDALEHIMEQLNVLrQTVSILEQR 


5830 


4495 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
N I EAAVQDRLNE QEGVP SVFN P P P S RPLQVNTADHR I YS YWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
L VYLHGDDHQDSDEF CRNTLCA P E V I S LI NTRML FWACS TNKPE 
GYRVSQALRENTYPFLAMIMLKDRRE*PV\VGRLEGLI\QPDDL 
INQLT F I MDANQT YL VS E RLER E ERNQTQ VLRQQQDEAYLAS LR 
ADQEKERKKREERERKRRKKEEVOQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPES VKI I FKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLSHTEVLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHAVMDSERQVKD 
TDD IES PKRS I RDSGYIDCWDSERSDSLSP PRHGRDDS FDSLDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWS TATS PAG LGKKALQDYGPRT\PVS\DDAESTSMFDMRCEE 
EAAVQ PHSRARQEQLQL INNQLREEDDKWQDDLARWKSRKRS VS 
QDL I KKEEERKKMEKLLAGEDGTS ERRKS I KTYRE I VQEKERRE 
RELHEAYKNARSQEEAEG I LQQY I ERFTI S EAVLERLEMPKI LE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

T rrv Vy T C XTVICVVX nrO P CT? V"ITT3 1? fOTV & D RUC T .T V C HM Wfl VA P VW 
v LAjiS V o VNbri 1 V rtK £■ Ctt JVCK£.i-r 1 v >\r/\no jd i r^i3\jrl r CAj V/vK V n 

GSPLELKQDNGS IEINI KKPNS VPQELAATTEKTEPNSQEDKND 
GGKSRKGNIELASSEPQHFTTTVTRCS PTVAFVE FPSSPQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YE EE P * 1 1 \ EDPWP FTVS S S S ADQLSTS SSMTEGS GTMNKI DL 
GNCQDE KQDRRWKKS FQGDDSDLLLKTRES DRLEE KGS LTEGAL 
AHSGNPVSKGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSS BDVKPKTL PLDKS INHQ I ES PSERRKS I SGKKLCSS CGL 
PLGKGAAMI IETLNLYFHIQCFRCX3\ ICKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRS RS AGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
SENLEKLEKLGMSSDLVSRLPTI YRNAHD I KNKSSAPSRVPPLF 
VPQGTSERKDSSGSVSPNTLSQEEGDQICLYHIRKSCSFQDKCH 
RVHFHLPYRWQFLDRGKWEDLDNMELIEEAYCNPKIERILCSES 
ASTFHSHCLNFNAMT YGATQARRLSTAS S VTKP PHF I LTTDWI W 
YWS DE FGS WQE YG RQG TVHP VTTVS S S D VE KAYLAY / WYTG V* R 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN*KGLPQTQIR\AP 
QDVTTMQTCNTKFPGPKSIPDYWDSSALPDPGFQKITLSSSSEE 
YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 
GKAVDERQLFHGTSAI FVDAI CQQNFDWRVCX5VHGTS YGKGS YF 
ARDAAYSHHYSKSDTQTHTMFLARVLVGEFVRGNASFVRPPAKE 
GWSNAFYDSCVNS VSDPS I FVI FEKHQVYPEYVI QYTTS S KPSV 
TPS ILLALGSLFSSRQ 


5833 


170 


3289 


S I LCLLS P CWQ FGKPWS I LSSRS RHS PCTKKGWEGMRKHLHT "" 
RQGHK* VHVEI S KALW VYRDD YF I RHS I S VSAVI VRAW I THKYR 
GRDWNVKWEENLLHAVAKNYTLLQTIPPFERPFKDHQVCLEWNM 
GYIWNLRANRIPQCPLENDWALLGFPYASSGENTGIVKKFPRF 
RNRELEATRRQRMDYPVFTVSLWLYLLHYCKANLCGILYFVDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
DISFNGGQIWTTSIGQDLKSYHNQTISFREDFHYNDTAGYFII 
GGSRYVAG I EGFFGPLKY YRLRS LHPAQI FNPLLEKQLAEQ I KL 
YYERCAE VQ E I VS VY AS AAKHGGE RQE ACHLHNS YLDLQ RR YGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLLTVPRNQNESVSEIG 
GKIFEKAVKRLSSIDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLNVPRDQLQGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELSYAYYSNIATKTPLDQHTLQGDQAYVETIRLKDDEIL 
KVQT KEDGD V FM WLKHE ATRGNAAAQQRLAQML FWGQQG VAKNP 
E AA I E W YAKGALE TED PAL I YD YA I VLFKGQG VKKNRRLALE LM 
KKAAS KGLHQAVNGLGWY YHKFKKNYA\KAAKYWLKA\EE \MGN 
PDASYNLGVLHLDGIFPGVPGRNQTLAGEYFHKAAQGGHMEGTL 
WCS LYYITGNLETFPRDP E KA WWAKHVAE KNG YLGHVIR KGLN 
AYLEGSWEJEALLYYVLAAETGIEVSQTNLAHI CEERPDLARRYL 
GVNCVWRYYNFSVFQIDAPSFAYLKMGDLYYYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLAIiLIEEGTIIPHHILDFLEIDSTLH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SALI YFLGTFLLS I LIAWTVQYFQSVSASDPPPRPSQASPDTAT 
STAS PAVTPAADAS DQDQ PTVTNN PE PRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG~ 
SAAPGP I PGQS SS* VP LRLEQI QQ KADCPLS LE LAL KPRMAAQV 
TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG I ARY I EQATVHS SMNEMLEEGQEYAVMLYTWRSCSRAI 
PQ VKCNEQPNRVE I YEKTVEVLE PE VTKLMNFMYFQRNAI ERFC 
GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 
KNDHSAYKRAAQFLRKMADPQSIQESQNLSNFIJVNHNKITQSLQ 
QQLE V I S G YE E LLAD I VNLCVD Y YENRM YLTPS E KHMLLKVMG F 
GLYLMDGSVSNIYKLDAKKRINLSKIDKYFKQLQWPLFGDMQI 
ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 
RF I S ELAR YSNS EVVTG SGRQEAQKTDAE YRKLFDLALQG LQL L 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 

EEKFALVE VIAM I kglqvlmgrmesvfnhairhtvyaalqdfsq 

VTLME PLRQAI KKKKNVIQSVLQAIRKTVCDWETGHEPFNDPAL 
RGEKDPKSG*DIKVPRRAVGPSSTQLYMVRTMLESLIADKSGSK 
KTLRS SLEGPTILD I EKFHRES FFYTHL INFSETLQQCCDLSQL 
WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 
YSLDLYNDSAHYALTRFNKQFLYDEIEAEVI^CFDOFVYKLADQ 
I FAYYKVMAGS LLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 
HVQLLGRS I DLNRL I TQR VS AAMYKS LE LAIGR FES ED LTS I VE 
LDGLLE INRMTHKLLSR YLTLDGFDAMFREANHNVSAP YGR ITL 
HVFWE LNYDFLPNYC YNG S TNR FVRTVL P FSQE FQRDKQ PNAQP 
Q YLHGS KALNLAYS S I YGS YRNFVG P PHFQVI CRLLGYQG IA W 
MEELLKVVKSLLQGTILQYVKTLMEVMPKICRLPRHEYGSPGIL 
EFFHHQLKDIVEYAELKTVCFQNLREVGNAILFCLLIEQSLSLE 
EVCDLLHAAPFONI LPR VHVKEGERLDAKMKRLES KYAPLHLVP 
LIERLGTPQQIAIAREGDLLTKERLCCGLSMFEVILTRIRSFLD 
DP I W RG P LP SNGVMHVDE CVE FHRLW S AMQFVYC I P VGTHE FTV 
EQCFGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
E 1 1 KNVPLKKMVERIRKFQI LNDE I IT I LDKYLKSGDGEGTPVE 
HVRCFQPP IHQSLASS 


5835 


4209 


1904 


SGN I RMAOGSHQ I DFQVLHDLRQKFPE VPEWVSRCMLQNNNNL 
DACCAVLS QESTRYLYGEGDLNFS DDSG ISGLRNHMTSLNLDLQ 
SQNI YHHGREGSRMNGSRTLTHS I SDGQLQGGQSNSELFQQEPQ 
TAPAQVPQG FNVFGMS S S S G AS NS APHLGFHLG S KGTSSL S QQT 
PRFNPIMVTIAPNI QTGRNTPTSLHIHGVPPPVLNSPQGNS I YI 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLSHTSSOOPNOOGHOTSHVYMP T S^PTT^OPPTTHQ <5(5<5 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDELMSRSQPKVYISA 
NAATGDEQVMRNQPTLF I S TNSGASAASRNMSGQVSMGPAF IHH 
HPPKSRAIGNNSATSPRVWTQPNT\EYTFKITVSPNKPPAVSP 
G WS PT FE LTNL LNHPDH YVETEN I HHLTDPTLAHVDR I S ETRK 
LSMGSDDAAYTQD I * RISNS WLGMVAHACNSSALGGQDGR 1 1* A 
QEFETSWGNIWRLRLYRRF*NYAGMVAHTCSPSYSVD*ALLVHQ 
KARMERLQRELE I QKKKLDKLKS EVNEMENNLTRRRLKRSNS I S 
QI PSLEEMQQLRSCNRQLQIDIDCLTKE I DLFQARG PHFNP S AI 
HNFYDN I G FVGPVP P KPKDQRS 1 1 KTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 
SDVNYQCLFSAHVLHLRGVLTTQPVEDERGNVFLWNGEIFSGIK 
VEAEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
H YLWFGRDF FGRRS LLWHFSNLGKS FCLS S VGTQTSGLANQWQE 

vpas \ dfs elilsliisfpdalfync i lgni flgr i llkkmli a* 
vkfqqtyqhlyqr*qmkpncilknllfl*i*cx:hklhwrliavi 
fpmchlqeryfks fllmyt* keviqqfidvlsvavkxrvlclpr 
demtanevlktctjrkanvailfsggidsmviatiadrhiplde 
p i dllnvaf i ae e ktmpttfnregnkqknkce ips e e fs kd vaa 
aaads pnkhvsvpdritgraglkelqavs psriwnfveinvsme 
elqklrrtr ichl irpldtvldds igcavwfasrg i gwlvaqeg 

VKS YQSNAKWLTG IGADEQLAG YSRHRVRFQSHGLEGLNKE IM 
MELGR I S S RNLGRDDRVIGDHGKEARFP FLDENWS FLNSLP I W 
EKANLTLPRGIGEKLLLRLAAVELGLTASALLPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIEKETKL 


5837 


4792 


903 


NGNAVAQAPVTNCCYLATGS KDQTIR I WS CSRGRGVM I LKLPFL 
KRRGGG I D P TVKERLWLTLHWPSNQ PTQLVS S CFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RD VKCWDI ATLECS WTLPSLGGFAYSLAFS S VDIGSLAI GVGDG 

mirvwntls i knnydvknfwqgvkskvtalcwhptkegclafgt 
ddgkvglydtysnkppqisstyhkktvytlawgppvppmslgge 
gdrpslalys cx3geg i vlqhnpwklsgeafdinkli rdtns i ky 
klpvhtei s wkadg kimalgnedgs i ei fq \ i pnlkl i ctiqqh 
hkl vnt i s whhe \ hgs paqkls yl \ mpsgs qqcspftchnlknc 
p*kaapespsdplqspyrtppoghtaqdypvwawephih*wegl 
vfcfpidgyspgcwd\afpgkeapvaifrg\hqqrllcvawspl 
dpdciysg\addfcvhkwltsmqdhsrppqgkksielekkrlsq 
pkakpkkkkkptlrtpvkles idgneeesmkensgpvengvsdq 
egeeqarepelpcgiiapavsrepvictpvssgfekskvtinnkv 
illkkeppkekpetlikkrkarsllplstsldhrskeelhqdcl 
vlatakhsrelnedvsadveerfhlglftdratlyrmidiegkg 
hlengh pel fh qlm lw kgdl kgvlq taaergeltdnl vamap aa 

G YHVWLWAVEAFAKQLC FQDQYVKAASHLLS IHKVYEAVELLKS 
NHFYREAIAIAKARLRPEDPVLKDLYLSWGTVLERDGHYAVAAK 
CYLG ATCAYDAAKVLAKKGDAAS LR TAAELAAI VGEDE LS AS LA 
LRCAQELLIJVNNWVGAQEALQLHESLQGQRLVFCLLELLSRHLE 
E KQLS EGKSSSS YHTWNTGTEGPFVERVTAVWKS I FS LDTPEQY 
QEAFQKLQNI KYPSATNNTPAKQLLLHI CHDLTLAVLSQQMASW 
DE AVQ ALLRAWR SYDSGSFTI MQ E VYS AFLPDGCDHLRD KLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 
TANGPDKNEPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPES I KAWP FPDVLECCLVLLL I RSHFPGCLAQEMQQ 
QAQE LLQKYGNT KTYRRH CQT FCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRJDVMLENYSNL 
ISLDLESSCVTKKLSPEKEIYEMES\PSGRIWGNVSTITFQYNG 
LG DNM E CKGNLEGQVS KS EGLYMCVK I TC E E KATE SHSTS S TFH 

RNT FS KKPS Y I * HQ \ KFRLGE KP YE CM ECGKAFGRTS DL I QHQK 
IHTNE KPYQCNACGKAF I RGSQLTEHQRVHTGEKP YDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKP YE CKECGKAF I LG SHLTYHQRVHTGE KP Y I CKECG KAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKE CGKAFI SNSNL I QHQRIHTGE KPYKCKE CGKAF I CGKQLS 
EHQRI HTGEKPFE CKE CGKAFIRVAYLTQHEKI HGEKHYE CKEC 
GKTF VRATQLTYHQR I HTGEKPYKCKECDKAF/ HLWLT I LS EHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN * E YS SHKI CMHS IALAS LDFAHLQE KNPEN 


5839 


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL 
EEVEGDVAELELKL\DKLVKLCIA\M I DTGKAFCVANKQFMNG I 
RD\LAQNS\NNDA\ WETKFAPS FLDS LQEM INFHTIL/ L * PNS 
E IN*GHS FQNFVKEDLRKFKDAKKQFENSQ* KRKKIALVKNAPV 
PSRPAS LEL* KP PNI LTATRKCFRH IALDYVLQINVLQSKRRSE 
I LKSMLS FMYAHLAFFHQG YDLFSELG P YMKDLGAQLDRLVGDA 
AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RAbWAr KlTNWKKWi 1 blUWWUV VYUlUtf KDNP rVVVfcDljKJjL. I VK 
HCEDI ERRFCFEWSPTKS CMLQADSEKLRQAWI KAVQTS I \AT 
AYRBKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 
CIPGNASCCDCGLADPRWASINLGITLCIECSGIHRSLGVHFSK 
VRSLTLDTWEPELLKLMCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKFVDKIFL*SLSPP\EQQKK\FVSKSSE 
EKRLSISKPGP\GDQVRASAQSSVRSNDSGIQQSSDDGRESLPS 
TVS AN S LYE PEGE RQDS S M FLDS KHLNPGLQL YRAS YEKNL P KM 
AE ALAHG ADVNWANS EENKAT PL I QAVLGGS LVTCE FLLQNGAN 
VNQRD VQGRG P LHHATVLGHTGQVCLFL KRGANQHATDE EG KD P 
LS IAVEAANADI VTLLRLARMNEEMRESEGLYGQPGDETYQDI F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQI SS P RWRS P QRAFMS ALS KTQTQSAP ALQ 
GLSS LLQSVTGNP VPASEAASQSTS AS PANTTVYTI KGRNLPSS 
AQPFI PKSFNYSPNSSTSEVSSTSASKAS IGQS PGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\EMKIHNFLKGNPGFSVA+NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLMDSSQEKFYPDTSFQEDEDYRDFEYSGP 
P P S AMMNLQKKPAKS I LKS S KLSD TTE YQ P ILS S YSHRAQE FG V 
KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFS PQNTLAAPTGHP PTSGVEKVLAST I S TTST I E FKNMLKNAS 
RXPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQEEHY 
RIETRVSSSCLDLPDSTEEKGAPIETLGYHSASNRRMSGEPIQT 
VE S I R V PG KGNRGHGREAS R VG WFDLS TS GSS FDNGPSS AS E LA 
S LGGGG S GGLTG F KT AP YKE RAPQ FQES VGS FRS NS FNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKBHGGIFSRDAPTHLPS 
VDLSN P FTKE AALAHAAP PP P PGEHSG I PFPTPP P PPPPGEHSS 
SGGSG VP FSTPPP P P PPVDHS GWP FPAP PLAEHG VAGAVAVFP 
m-lSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 
TLPSHS LEHLGP PHGGGGGGGSN^S SG P PLGPSHRDTI S RSGI I 
LRS PRP D FR PRE P FLS RDP PHS LKRPR P P FARGP P F FAP KR P FF 
PPRY 


5841 


1908 


762 


GLRLFLVLTVWPMMKPSWLSRTEFSKRLLCRTLWCQSGWSSRSY 
TRSMLKMTTSINRRSRTSTKSTRTSARPGLTATVSIGLSDSPTW 
RHCWMTARS CSGEKGGHWAPRQ VGVYLL PGR VGCVSSRVSPS FP 
GDGLDSGLARRGSAVSALASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFTI EDFHNTFMDLI EQVEKQTSVADLLASFND 
Q ST S D YL WY LRLLTS G YLQRBS KFFEHF I EGGRTVKE FCQ \Q E 
\VEPMCKESDHIHII ALAQGLQR VH PGWB YMG P RPRAATTNPH I 
FP*GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSVWKWSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QEPTADFKLRS TCGCGREMTCPDKPGQL INWFI CSLCVPRVRKL 
noiaRnrKjinKiNiibijo jalai r JjVSQVGRASIjQHGQAAEKGP 
HRS RDTAEPS FPE I PLDGTLAP PES QGNGSTLQPNWY I TLRS K 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLML*HLGTLREQTWLRLESDPGGWCGVRE/WRAGGPDFLQPSS 
RESNIRI YS ESAPS WLS KDDIRRMRLLADSAVAGLRPVS S RSGA 
RLLVLEGGAPGAVLRCGPS PCGLLKQPLDMS EVFAFHLDR I LGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLLKQ KCWQNGR VPKPESG CTE I HHHE WS KMAL FD FLLQ I 
YNRLDTNCCGFRPRKFnAPVnNCT.PBvnnTvv'CA&r AUTTnnvn 

DFRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 
LRQ KLLQS L FLD KG YWE SQGGRQG I E KLIDV I EHRAKI L I T Y I N 
AHGVKVLPMNE 


5843 


500 


1453 


GTARLVTCWVLHGQ*VKKPAWEPGWWL*Q*RCRPKGWGLGAGM 
RGSRMS Q P PQ CLRRAQS S CCHFMVKLLDDGT FM I PGE KVAHTS L 
DALVTFHQQKP I EPRRELLTQPCRQKDPANVDYEDLFLYSNAVA 
E EAACP VS AP E EAS PKP VLCHQS KE RKP S AEM / RQNNHQG S HF L 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCELWT 
LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 
LTAPGTKRQKGPHQEGREVGQLH+GDPRGQELAPNGSESPILPG 
VQARAPGLGRA 


5844 


202 


2471 


FDS AVLS S I NVMAVIj PG PIiQLLG VLLT ISLSSIRLI Q AGAY YG I 
KPLPPQI PPQMPPQI PQYQPLGQQVPHMPLAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPGGYPGVGKPGMPGM 
PG KPGAMGMPGAKG E IGQ KGE I GPMG I P * PQGP PG PHGLPG I GK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVRKPOVTfJKPriD\nri>Tr^v\ Dr-Tvnr-cn 

GPQGP IGVPGVQGPPGIPG IGKPGQDG\ I PGQPGFPGGKGEQGL 
PGLPGP PGLPG I GKPGFPGPKGDRGMGGVPGALGPRGEKGPI GA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PGPPGPPGPPGPPAVMPPTPPPQGEYLPDMGLGIDGVKPPHAYG 
AKKGKNGG P AYEM P AFTAELTAP FPP VGAP VKFNKLLYNGRQNY 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 
EYXKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASLQDKKANPKEKTAMCLVNEIARFNRVQPQYKLLNER 
GP AHS KMFS VQLS LGEQTWES EGS S I KKAQQAVGNKALTES TLP 
KPI # KPPKSNVNNNPGCITPTVELNGLAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVM YNQRYHCP I PKI FYVQLTVGNNEFFGEGKT 
RQAARHNAAMKALQALQNEPI PER5 PQNGESGKDMDDDKDANKS 
E I SLVFE IALKRNMPVS FEVIKESGPPHMKS FVTRVS VGEFSAE 
GEGNSKKLSKKRAATTVLQELKKLPPLPWEKPK\HFFKKRPKT 
I VKAG PEYGQGMNP I S RIAQIQQAXKEKEPDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNKKIAK1WAAEAMLLQLGYKASTNLQ 
DQLEKTGBNKGWSGP KPG FPEPTNNTPKG I LHLS PDVYQEMEAS 
KHK V 1 SO 1 1 Ui Y La FltDMNgPSSS FFS I S P TSNS SAT I ARELLM 

NGTSSTAEAIGLKGSSPTPPCSPVQPSKQLEYLARIQGFQVHYC 

DRQSGKECVTCLTLAPVQMTFHAIGSSIEASHDQV*YATAILLC 

YGPARKWKArKMEAMCAHAALLSLIHYLIAPSARLEKSKLFALG 
N- 


5846 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNCSV1SQDDFF 
KP E S E IE TDKNG FLQ YD VLEALNKEKMMS A I S CWMB S ARHS WS 
TDQESAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPYEECKR 
RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYIoDGT 
KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS /CK+ IRK 
LQGVI 


5847 


2769 


505 


APEMEDLSSPDSTLLQGGHNLLSSASFQESVTFKDVIVDFTQEE - 
WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
HKRDDSWSSNLLESWEYEGSLERQQANCX3TLPKEIKVTEKTIPS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN* /CVEKAF 
S RS ENL INHQR IHTGDKP YKCDQCGKG FIEGPS LTQHQR I HTGE 
K P Y KCDECG KAFS QRTHLVQHQR I HTG EKP YTCNECG KAFSQRG 
HFNEHQKIHTGEKPFKCDECDKTFTRSTHLTQHQKIHTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQ KTHTGE KP YDCAECGKS FS YWS SLAQHLKIHTGEKP YKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
"SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECAECGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 
HL TQHQR I HTG E KP YKCNE CD KAFS RS THLTQHQR I HTG E KP YK 

CNECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSALN 
KHQRLHPGI 


5848


22 


2961 


AA PRR LLRGGDGDRTPR FPL PALLR PG P P AEAAP ERRKM P A VS K 
GDGMRGLAVFISDIRNCKSKEAEIKRINKELANIRSKFKGDKAL 
DG YS KKKYVCKLLFI FLLGHD IDFGHNEAVNLLS SNR YTEKQ IG 
YLFISVLVNSNSELIRLINNAIKNDLASRNPTFMGLALHCIASV 
GSREMAEAFAGE I PKVLVAGDTMDS VKQSAALCLLRLYRTS PDL 
VPMGDWTSRWHLLNDQHLGWTAATSLITTLAQKNPEEFKTSV 
S LAVS RLS \ R I VT S AS TDLQD YTY * FC PG FLGL S VKLLRLLQCY 
PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
I S L 1 1 HHD S E PNLL VRACNQLGQ FLQHRETNLR YLALE SMCTLA 
SSEFSHEAVKTHIETVINALKTERDVSVRQRAVDLLYAMCDRSN 
APQIVAEMLSYLETADYSIREEIVLKVAILAEKYAVDYTW\YVD 
TILNLIRIAGDYVSEEVWYRVIQIVINRDDVQGYAAKTVFEALQ 
APACHENL VKVGG Y I LGEFGNL IAGDPR S S P LI Q FHLLHS KFHL 
CS VPTRALLLSTYI KFVNLFPEVXPTIQDVLRSDSQLRNADVEL 
QQRAVE YLRLS TVASTD I LATVLE EMP P FPERES S I LAKLKKKK 
GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 
LGAAP PAP AGP P PS S GGSGLLVDVFS DS AS WAPLA PG S EDNFA 
RFVCKNNGVLFENOLLQIGLKS E FRQNLGRMFI FYGNKTSTQFL 
NFTPTLICSDDLQPNLNLQTKPVDPTVEGGAQVQQWNIECVSD 
FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASQDFFQ 
RWKQLSNPQQEVQNI FKAKHPMDTEVTKAKI IGFGSALLEEVDP 
NPANFVGAG I I HTKTTQ IGCLLRLEPNLQAQMYRLTLRTS KEAV 
SQRLCELLSAQF 


5849 


3545 


1895 


KRRE I KE T V FHH VAQAGLE LLS S S N P P S S AS RS AG I TG MRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGIEVEES 
DEFrREDMKYKDATNKHSHI*HREDI3ITIEDLWKRWKTSEVHNW 
TLEDTI^WLIEFVELPQYEKNFRDNNVKGTTLPRIAVHEPSFMI 
SQLKI S DRSHRQKLQLKALD WLFGPLTRP PHNWMKDF I LTVS I 
VI GVGGCWFAYTQNKTS KEHVAKMMKDLES LQTAEQSLMDLQER 
LEKAQEENRNVAVEKQNL* RKMMDE INYAKEEACRLRELREGAE 
CELSRRQYAEQELEQVRMALKKAEKEFELRSSWSVPDALQKWLQ 
LTHEVEVQ YYN I KRQNAEMQLAIAKD EAE K I KKKRS TV FG TLHV 
AHS S SLDE VDHK I L E AXKALS E LTTCLRERLFR WQQ I E KI CG FQ 
IAHNSGLPSLTSSLYSDHSWWMPRVSIPPYPIAGGVDDLDEDT 
PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAI YFSAE KQWEVPDTASECDSLNSS I GRKQS PP / S KPRD I PN 
IIS/DERYQEMRCP*RIPSGGII J 


5850 


3 


1895 


KAVLNFSASGS VISLTGSNPMHDASMWHLKKNGI I VYLDVPLLN 
LICRLKLMKTDRIVGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGAS PEEVADKVLNAI KRYQDVDSETFISTRHVWPEDCEQKVSA 
E F F I EAVI EGLASDGGLFVPAKEFP KLS CGEWKSLVGAT YVERA 
QILLERCIH PAD I P AARLG EM I ETAYGENFACS KIAP VRHLSGN 
Q F I L E LFHG PTG S F KDLS L QLMPH I FAQCI P P S CNYM I L VATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSQ 
RENGWAVGVESDFDFCQTAIKR I FNDSDFTG FLTVE YGT I LSSA 
NS INWGRLLPQWYHASAYLDLVSQGFISFGS PVDVCI PTGNFG 
N I LAAVYAKMMG I P I R K F I CASNQNHVWTD F I KTG \ HYDLRGKE 
N*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQ I EKALVEKLQQDFVADWCSEGECLAAINS TYNTSG YILD 
PHTAVAKWADRVQDKTCPVI ISSTAHYSKFAPAIMQALKIKE I 
NETSSSQLYLLGSYNALPPLHEALLERTKQQEKMEYQVCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHNRQ 
RGCCGSLiADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVELY 
GNS LLLTAVYGLWAGSVLVLGAI IGDWVDKNARLKVAQTSLW 
QNVS VI LCG 1 1 LMMVFLHKHE LLTM YHG WVLTS CYI LI I T IANI 
ANLAS TATA I T I QRDW I VWAGEDRS KLANMNAT I RRI DQLTN I 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVLLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELEHEQEPTCASQMAEPFRTFRDGWVSYYNQPVF/LGWHGSCFP 
tiYDCPGL*LHHHRVRLHSGTEWFHPQYFDGS IS YNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGPFITSGPG/WFRQ 
YYFFISGRH* VL FTESDFY YVAMDFGGHG LSSH YS PG VP YYLQT 
FVSE IRRWAGKKQSVYFRRCGGCSRAP PLITGGGVGSRKQRWP 
E SGAWALAPG L PAI HGRS WES 


5853 


223 


1346 


RLLGLS RVKGLHG PAASAW I S DPETRGD PGG P WGMWRGSDLRPR 
P VSLTGLTLVCK* AAQGPQV\HS VKLCFGLGG \ PCLL\ FP I FRP 
LLLHPRRPRLKPGTRGVAVEPHALRWHVAHGEEAGIRAAGPGH 
GGVEIPOG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGI^QDOGPREQQKQGSGRHDTIIiGDWGESE 
SRWVRGNFRTGTAATLI GFSRNPTLNGS ENWGSLVS IQEEG PDT 
GWEREKRNPAEMGNPQRWASP IHTPPLGPE ILRAMPEALRAMPE 
ALGLRPDPArS VPSALS/QTF/ PESWPRS CLRNQGETLGMG P VP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAALNNRENASS *NGY/ S RWKQDIRRIENHI IQE 
LKHLCAMIKRVLLERLENTRKLRELTEGRTLDWPQNRITEVSAK 
RQIVTEYREKGKRN*EEKKRDLEGRSRRYNLCIIGIPETEDRAS 
QAETIKDLLE/ENFPELKNELDLQMEKAHRIPLKFNEKKAASRH 
IRVTFL/KFQRRNILQASSQRKQVTYKGAKVRLTSDFSPAILNA 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISH3jHK\FLFLLLPSLLMGYSESPPPITDSWAP 
Fl SLTHHVLSQSQS PLS SNCWI CLSTHTQ* FTALPADLLTWTQS 
NVSLHISYLAIPFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTQTSFISPPP 

lclsrtypnpahatmvgqvpqslcgliftl/rtpcrpsilhpny 
ki i stsawqkvlcfsgs ptihts lhlttgssflsfhp 1 pgfpaa 
nsalwsslkgppgknvtipspvtgt*qpphrgsn/rltvdkdn 



WO 01/53312    PCT/US00/34263

SEQ ID NO:
Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence
Predicted end nucleotide location
Amino acid segment containing signal peptide

F FLS P K PN S LHQ L P SQ \ TP YCALTG AALAGS YP I WENENT LS WL 
PTFTYKFCLSTPSLFFLCDTN*YLCLPANWSGTCTLVFOAPTIN 
I LPPNQT I LI S VEAS I S SSP IRNKWALHLI TLLTGLG I TAALGT 
GIAGITTS ITSYQTLFTTLSNTVEDMHTS I TSLQRQLDFLVGVI 

RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFLLLMIGP 
CIFNLVSRFISQRLNCFIQASMQKHrDNIFHLCHV*YQSLRGNH 
SEAPEPRP 


5856 


173 


1137 


P WLHGLGLS AVFL FYL* / YVT FRLYGG 1 1 LLLLIF I S IAG I LYK 
FQDVLLYFPEQPSSSRLYVPMPTGIPHENIFIRTKDGIRLNLIL 
IRYTGDNS PYS PTI I YFHGNAGNIGHRLPNALLMLVNLKVNLLL 
VDYRGYGKSEGEASEEGLYLDSEAVLDYVMTSPDLDKTKIYLSG 

MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMMKQ 
LYELS PSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KE WKSH 
SPEEMAKTSSNVTII 


5857 


1597 


56"3 


KLIGKVLVLSVVADAMAAFAVEPQGPALGSEPMMLGSPTSPKPG 
VNAQFLPGFLMGDLPAP VTPQFRS I SGPS VGVMEMRS PLLAGGS 
P PQPWPAHKDKSGAP P VRS I YDD I S SPGLGSTPLTS RRQ PNI S 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDSLTSEDH\LDDSWGDCIWGFLKASA\SYILL\QFAQYGGIS* 

DKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW 


5858 


355 


1419 


PPHQPAAASTSXHQQQOPPPPPQDSSKPWAQGPGPAPGVGSAP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
P S S G V P TTP PQ AGG PP P P P AAVPG PG PG PKOG PG PGG P KGGKM P 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 
LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 
ALA*NCPKPELG+YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 

Mr V\l T IN \- maJK Kx r KFNIAj FniiNb/i V t-IN K ri\JU ¥ W e*\j r* 

FAS 


5859 


307 


1503 


GGS SAR PRAS S RRMLSRKKTKNEVS KPAEVQGKYVKKETS PLLR 
NLMPS FIRHGPTI PRRTDI CLPDSS PNAFSTSGDGWSRNQS FL 
RTP I QRTPHE I MRRESNRLSAPSYLARSLADVPRE YGSSQS FVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASG IGR VAATS LGNLTNHGSEDL PLP PGWS 
VDWTMRGRKYY I DHNTNTTHWS HPLEREGLPPG WER VESSE FG T 
YYVDHTNKKAQY \RHPCAPTCTS V* STTS CHI/AS /RQQTERNQ 
S LLVPANP YHTAE I PDWLQVYARAPVKYDH ILKWELFQLADLDT 
YQGMLKLLFMKELEQIVKMYEAYRQALLTELENRKQRQQWYAQQ 
HGKNF 


5860 


2956 


1270 


TIRVEEFPLCPGGGKAQLSSASLLGAGLLLQPPTPPPLLLLLFP 
LLLFS RLCGALAGP 1 1 VE PHVTAVWG KNVS LKCL I E VNET I TQ I 
SWEKIHGKSSQTVAVHHPQYGFS VQGE YQGRVLFKNYS LNDAT I 
TLHNIG FSDSGKY I CKAVTFPLGNAQSS TTVTVL VEPTVSLI KG 
PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETAT 1 1 SQ YKLFPTRFARGRR I TCWKHPALEKDI R YS FI LD I 
Q YAPEVS VTG YDGNWF VGRKGVNLKCNADANP PP FKS VWSRLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 
VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
PS?LSTL\ATIKGWTQLPTIIA+CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNPVNNLIRKDYLEEPEKTQVWNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRWMARLL 
SEGEQGI PTACAAFAQQ PAG/EP RRGLAGVGEGGPQCS WVNYRC 
TLE FLVSLLGTDLARGRGNSAS GPTAPADS KQL/ ML * DVHRRVI 
LE * RMNSGS PARDNAPSQRFCTNLSEGLRFGI S PSWREALYGCH 
A 


5862 


1556 


483 


P P FQL I MGE I KVS PDYNW FRGTVPLKKI I VDDDDS KI WSLYDAG 
PRS IRCPLI FLPPVSGTADVFFRQ ILALTGWGYRVIALQYPVYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHS L I LCNS FSDTS I FNQTWTANS FWLMPAFMLKKI VLGNFSS 
G P VD P MMAD A I D FMVE RL E SLGQS E LAS RLTLNCQNS YVE PH|CI 
RDIPVTIMDVFDQSALSTFJUCEEMYKLYPNARRAHLKTGGNFPY 
L CRS AE VNL YVQ IHL/R/RNSME PNTR P LTHQWS VPRS LRCRKA 
ALASARRSS S VS LAVNDELTRCVLV* S VASAPVSRPFPSGS SGS 
PVLTVSGK 


5863 


2714 


249 


P F P S RG S LP LAAP REDTMG PLMVL FC LL FLYPGLADS AP S C PQN 
VN I SGGTFTLSHGWAPGS LLTYSCPQGL YPS PASRLCKS SGQWQ 
TPGATRSLSKAVCKPVRCPAPVSFENGIYTPRLGSYPVGGNVSF 
ECEDG FI \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I SL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATNPTQKrKESLGRKI 
QIQRSGHLNLYLLLDCSQS VS ENDFL I FKESAS LMVDR I FS FBI 
NVS VAI I TFASEP KVLMS VLNDNS RDMTEVI S S LENANYKDHEN 

blbiril lAAbnbV I IjMMWWyMrcJjijIjMEl 1 P1AW \yriXKMAJL 1JLiJj\ 1 

DGK\SHMGGS PKTAVDHIRE I LNINQKRNDYLDI YAIGVGKLDV 
DMRELNELGSKIOX3ERHAFILQDTKALHQVFEHMLDVSKLTDTI 
CX3VGNMS ANAS DQERTPWHVT I KPKSQET\C\ RGALISDQWVLT 
AAHCFRDGNDHSLWRVNVGDP KSQWGKE FLIEKAVI SPGFDVFA 
KKNQGIL\EFYGD\DIALL\KLAQKVKM\STHCQGPSCLP\CTM 
\ EANLGFLRETFKGS TCR \DHENEL/ VWNKQS V\ PAHF\ VAL \N 

rjcvT iruT.TT.OMrvTrwTCff^DrT cdyvvtm\ ttdvtt •p V fivdit\ inrr 
Ljt> ivUiiriljl -UKMLj V e>W 1 oLLKbubf J\r<i\. 1 M \r tvi Li I \ U VKJl \ vvl 

D\QFL\CS\GPQEDESP\CK*E\SGGA\VFLERRFRLSAGGVWC 
SWGL\YNP\CLGSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q*S PWLRQHPGGMS * I FLPLLANGHLS P FACPARI CRPLHFLPS 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPSIPPSSPLACVLKNLKPLQLTPDLKPKCLIFFCNTAWPQY 
KLDNDS K* PENGTFEFS ILQVLDNS CHKMGKWSEVPDVQAFF \ S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
M DS SDIj P PS PO A A PR O AP PH Pfl 3 HT .A 3 A P P PYMP V T T<? P PHTW Q 
S LQFHS VTS PPPPAQQFTLKKVAGAKG I VKVSAPFS LSQIR * RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTIVGCIFFKTAIISHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACTRV*VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/ CVY VCVLCVWACMRMSTCVWLVYG * ACTCVWMHM / CS CTCR/ C 
VHVCCMSMHACE CLCVYLH I CG CAGTRRW WAGSARGS R S CS RL P 
C WAPG PGL S LPG P S C P S VEQGLGGG PGQLQGRSGE ARLG EHRG W 
GSPAAVCSRNCTVSPRRGADCFEAPDVPKQPPGWGRASFEERGC 
GGRGWVCAPPLNGPQCCCFS IKPELKAKKKK 


5866


98 


3197 


AR P E VP AP PAWLS RRG AAKMGDKKDD KDS P KKNKG KERRDLDDL 
KKEVAMTEHKMS VEEVCRKYNTDCVQGLTHSKAQE I LARDGPNA 
LT PP P TT P EW VKF CRQLFGGFS I LLW I GA I LCFLA YG I QAGTE D 
D PSGDNLYLG I VLAAWI I TGC FS YYQ EAKS S K I ME S FKNMVP Q 
QALVIREGEKMQVNAEEWVGDLVEIKGGDRVPADLRI I SAHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGWVATGDRTVMGRIATLASGLEVGKTPIAIEIBHFIQLITGV 
AVFLGVSFFILSLILGYTWLEAVIFLIGI IVAJNVPEGLLATVTV 
CLTLTAKRMARKNCLVKNLEAVETLGSTS T I CSDKTG TLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTSFDKSSHT^VALF*H/LLGFC 
NRPVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILLQGKEQPIjDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGI KVIMVTGDHP I TAKAIAKGVG 1 1 FSGNETVED IAARL 
NIPVSQVNPRDAKACVIHGTDLKDFTSEQIDEltQNHTEIVPAR 
TS PQQKL 1 1 VEGCQRQGA I VAVTGDG VNDS PALKKAD I GVAMG I 
AGS DVS KQAADMI LLDDNFAS I VTGVEEGRLIFDNLKKS IAYTL 
TSNIPEITPFLLFIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESD I MKRQPRNPRTDKLVNERLI SMAYGQIGM I OALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQWTYEQRX 
WE FTCHTAFFVS I WVQWADL 1 1 CKTRRNS VFQQGMKNKIL I F 
GLFEETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDE I RKLI LRRNPGGWVEKET YY 


5867 


3 


1485 


^t^^x-rt^^^Kij lajw t* rs\uf\j^LH^ P vAKP 
GPVKTLTRKKNKKiOCRFWKSKAREVSKKPASGPGAVVRPPKAPE 
DFSQNWKALQEWIiLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGCSEGSVSLSLVKEQAFG 
GLTRALALDCEMVG VGPKGEESMAAR VS I VNQYGKCVYDKYVKP 

TFPVTnYRTiVQr3TT3DTTlJT.I^rW7T?T7T V\T\Tr\lTZ?\TA.C»Jir VrtnTT t 7/"<tt 
i c*r v i^iAvovjXK.t'iiriiijrvyvjCiiiijiiV vyi\j£VAcir^l\\jRljjVGH 

ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 

I LGLQVQQAEHCS IQDAQAAMRLYVMVKKEWESMARDRRPLLTA 

PDHCSDDA* QS CPAAAAAPLQRQCDQSQGQ I TS PQSGNSGE TFS 

ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCXPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5871 


3 


3465 


F FFCRPLRLYS KTTGDRS AMAGAAGLTAE VS WKVLERRARTKRS 
VLKLL*LSLRRL*LEPTI*NGLLT*CSRLSVFRFLKV\GSVYEP 
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGGDQ KAK I QD S L YCAAGAWALALAYRR I DDDKGRTHE LEH S A I 
KCMRG I L YCYMRQADKVQQFKQDPRPTTCLHS VFNVHTGDELLS 
YEE YGKLQ I NAVS LYLLYL VEM 1 3 SGLQ 1 1 YNTDE VS F I QNLVF 
CV\ ERVYRVP \DFG \ VWGKREGKYY * / SGS TELHSS S VGLGKRQ 
L * KQFNG FNLFGNQGCSWSV I FVDLDAHNRNRQTLCS LLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGFKR 
FLRDGYRTSLEDPNRCYYKPAE I KLFDG I ECEFP I FFLYMMIDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYIIAKLLADELISPKDI 
DPVQRYVPLKJDQRWSKRFSNC^PLEM)LVVHVALIAESQRLQV 
FLNTYGIQTQTPOQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 
DI KNALQFI KQ YWKMHGRPLFLVL IREDNI RGSRFNP ILDMLAA 
LKKGIIGGVKVHVDRLQTLISGAWEQLDFLRISDTBELPEFKS 
FEELEPPKBSKVXRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
Q KLNDCS CLAS Q A I L LG I LL KR EG PNF I T KEGTVSDH I ER VYRR 
AGS QKLWS WRRAAS LLS KWDS LAPS I TNVL VQGKQ VTLG AF G 
HEEEVISNPLSPRVIQNIIYYKCNTHDEREAVIQQELVIHIGWI 
I SNNPELFSGTLKIRIGWI IHAME YELQIRGGDKPALDLYQLS P 
SEVKQLLLD Il^PQQNGRCWLNRRQIDGSLNRTPTGFYDRVWQ I 
LERTPNGIIVAGKHLPQQPTLSDMTMYEMNFSIjLVEDTLGNIDQ 
PQYRQIWBLLMWSIVLERNPELEFQDKVDLDRLVKEAFNEFQ 
KLX?SRLKEIEKQDDMTSFYNTPPLGKRGTCSYLTKA\^INLLLEG 
E VKPNNDDP CL I S 


5872 


68 


665 


VQGYMYRFVIKINSCYSEKTSICRHRCCPELPATQPWPTPTVFF 
NI AIDS ESLGC I \ S FKLFADKV/ P KRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDENFI/LKH 
T APGVLS T ANAG PTTNGSQ F F I CTAKTEDG + QHWFG KVKDGMS 
IVEALERSGSRNGKTS KKITAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 

YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VKEVLRAPGTIKDRI KXLLAHKNSMK 
KKAKI KNVTPE PTRTPTPKVNLQP FNYEEIVSRGGNSHGG\ KKG 
NEEKMKEGLEDEKREEKAL.KD*HRRERPFRG\DVFFPKVNEAGE 
FG LI L \ VQRKALTS KL EHKADLN I S VDCS FNHG \ I C DW \ KQDR \ 
EDDFDW\NPADR\DNAI \GFY\MAVPGLWQGHK\ KDIGRLKLLL 
PDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKWKTGKIQLYQGTDATKSIIFEAERGKGKTGEIAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRLARRRRRVRS LRRRRGWLRARWS RGQNNMAARR I TQETFD 
AVLQEKAKR YKMDAS GEAVS E TLQ F KAQDLLRAVP RS RAE M YDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
SRDYD VDHSG \ E A\ D S VLRGS \ SQVQ A\ RGRALN I VDQEGS L LG 
. KG ETQGLLTAKGGVG KLVTLRNVSTKKI PTVNRI TPKTQGTNQ I 
QKNTPSPDVTLGTNPGTEDIQFPIQKIPLGLDLKNLRLPRRKMS 
FDI IDKSDVFSRFGI EI I KWAGFHT I KDDI KFSQLFQTLFELET 
ETCAKMLAS FKCSLKPEHRDFCFFTI KFLKHSALKTPRVDNEFL 
NMLLDKGAVKTKNCFFEI I KPFDKYIMRLQDRLLKS VTPLLMAC 
NAYELSVKMKTLSNP LDLALALETTNS LCRKSLALLGQTFS LAS 
SFRQEKIL*AVGLQDIAPSPAAFPNF£DSTLFGREYIDHLKAWL 
VSSGCPLQVKKAEPEPMREEEKMIPPTKPEIQAKAPSSLSDAVP 
Q RADHR WGT I DQ L VKR VI EGSLS PKE RTLLKED PA YW FLS DEN 
SLEYKYYKLKLAEMQRMSENLRGADQKPTSADCAVRAMLYSRAV 
RNLKKKLLP\WQRRGLLRAQG\LRG\WKARRA\TTGTQTLLFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLARFVAQVG \PEIEQF\SI \ ENS TDNPDLWFL\KDQNS S \ AFK 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
G EAE FEDE PP PRE AE L E S PE VM P EEED E DD EDGGEEAP A\ PGRG 
GPSLEGSTPADGLPGEA\AEDDL/ALGAPALFTGLLQVTCFPFG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKELDFAO^KL\TDK\NIjGFQ\ML0KMGWKEGHGIjGSLGK 
G I R \ SRS ACTQQAAWGGSGWGIiS PS TCS LP LGS FTAKMAYS WQL 
IFVF 


5875 


296 


1848 


IJUU^LPLWRLSRRGFREYLIXSLSAPSALGGAMRSVSYVQRVA 
LEFSGS LFPHAI CLGDVDNDTLNELWGDTSGKVS VYKNDDSRP 
WLTCS CQGMLTCVGVGDVCNKGKNLLVAVS AEGWFHLFDLT PAK 
VLDASGHHETLIGEEQRPVFKQHI PANTKVMLISDIDGDGCREL 
WG YTDRWRAFRWEELGEGPEHLTGQLVS LKKWMLEGQVDS LS 
VTLGPLGLPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDGS 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 
GLFALCTLDGTLKLMEEMEEADKLLWSVQVDHQLFALEKLDVTG 
NGHEEWACAWDGQTY I IDHNRTVVRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
WTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 


5876 


1122 


224 


H L P LG VP S KVAGAAAME P Q EE RE TQ VAAWLKK I FGDHP I PQYE V 
N P RTTE I LHHLS E RNR VRDRD V YL VI E DLKQ KAS E YE S EAKYLQ 
DLLMESVNFSPANLSSTGSRYLNALVI1SAVALETKDTSLASFIP 
AVNDLTSDLFRTKSKSEE I KI ELEKLEKNLTATLVLEKCLQED V 

KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL
SARGQ\DAFSVPIQSLVALIRENWPRLKQQTIPLK\KKLESYLD
LMP\NPSHCSK*RIEEAK\RELA\SIEAELTRRVS\MMEL


5877 


2030 


1907 


GTLGKMAAS S SGEKE KERLGGGLG VAGGNSTRERLLSALEDLEV 
LSRELIE^IIAISRNQKLLQAGEENQVLELLIHRDGEFQELMFCLA 
LNQGK I HHEMQVLEKEVE KRDS DI QQLQKQLKEAEQ I LATAVYQ 
AKE KL KS I E KAR KGA I S S E E 1 1 KYAHR I S ASNAVCAPLTWVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR+INIILILQKSVCEL 


5878 


950 


2113 


GL W KCMQLQG PHTHR VQP * PTPRQQGPQ \ VPVAVI AGNRPNYL Y 
RMLRSLLSAQGVSPQMITVFIDGYYEEPMDWALFGLRGIQHTP 
ISIKNARVSQHYKASLTATFNLFPEAKFAWLEEDLDIAVDFFS 
FLSQS I HLLEEDDSLYCISAWNDQGYEHTAEDPALLYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKLWDWDWWMRMPEQRRGRECII 
PD VS R S YHFG I VGLNMNG Y FHEA Y F KKHKFNTVPG VQLRNVDSL 
KKEA YE VEVHRLLS EAE VLDRSKN P CEDS FL PDTEGHT YVAF I R 
ME KDDD FTTWTQ LAKCLH I WDLD VRGNHRG LWRL FRKKNH FL W 
GVPASPYSVKKPPSVTPIFLEPPPKEEGAPGAPEQT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGSPPTLLPLSPTSPRCAATMASSDED 
GTNGGASEAGEDREAPGKRRRLGFLATAWLTFYDIAMTAGWLVL 
AI AMVRF YME KGTHRGLYKS I QKTLKFFQTFALLE IVHCL IG I V 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
EITRYS FYTFSLLDHLPYFI KWAR YNFFI I L Y PVG VAGELLT I Y 
AALPHVKKTGM FS IRLPNKYNVS FD YY YFLL I TMAS YI PLFPQL 
Y FHMLRQRRKVLHG \G * L * KRM IK * S LQTRCFFQNNQDYLS PS F 
NNKNKQLCEISWIVWFLKI 


5880 


1138 


1324 


S LWCLVAGGLGLGPSSQNPLQRAG I IiARPREARGTFSALTACS A 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
♦KKKRGRCSS/WLSQPQHEREKEWLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSEHTDGHTSVQS VI EKLQE ENRLL KQKVTHVEDLNAKWQR YN 
ASRDEYVRGLHAQLRGLQIPHEPELMRKEISRLNRQLEEKINDC 
AEVKQELAASRTARDAALERVQMLEQQ I LAYKDDFMS ERADRER 
AQSRIQELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYIiA 
ADALELMVPGGWR PGTGS QQ P E P PAEGGHPGAAQRGQGDLQC PH 
CLQCFSDEQGEELLRHVAECCQ 


5881 


26 


441 


GG I HPS PTEAPRAQHLTMDCTWR I LFLVAAATGTHAQVQLLQSG 
SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGLE*MGPFD 
LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGCVEMiYSHSLEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYSKQVELELQQIEQKSIRDYIQESENIASLHNQITACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGLWPSALVTAI LEAP VTEPRFLEQLQELDAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKI REF I LQKIYS FRKPMTNYQ 
I PQTALLKYRFFYQFLLGNERATAKE I RDE YVETLSK I YLS YYR 
S YLG RLMKVQ YEE VAE KDDLMG VE DTAKKG FFS KP S LRS RNT I F 
TLGTRGSVISPTELEAPILVPHTAQRGEQRYPFEALFRSQHYAL 
LDNSCREYLFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDSY 
LADCYUA.IAVFLCIHIVLRFRJ^IAAKRDVPALDRYWEQVLALLW | 
PRFELI LEMNVQS VRSTD PQRLGGLDTRPH Y I TRRYAE PS S ALV 
S INQT I PN ERTMQL LGQ LQ VE VEN FVLR VAAE FSSRKEQLVFLI 
NNYDMMLGVLM\E * ERAADDSKEVES FQQLLNARTQEFIEELLS 
PPFGGLVAFVKEAEALI BRGQAERLRGEEARVTQL I RGFGS S WK 
S S VE S LS QD VM RS FTN FRNGTS 1 1 QG ALTQL 1 0 \ L YHRFHR V \ L 
SQPQLRALPARAELINIHHLMVELKKHKPNF 


5883 


2 


1374 


E FPGRRFRAVME AGAGAGAGAAGWS CPGPGPTVTTLGS YEAS EG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLELQGLLEDERLASAQCAEVFTKQIOQLQG 
ELRSLREEISLLEHEKESELKEIEQELHLAQAEIQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDWER I RGDYEME IASLRAEME 
MKS S EPSGS LGLS DYSGLQEELQELRERYHFLNBE YRALQESNS 
SLTGOLADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 
E QRRLQRE LKCAQNE VLRFQTSHS \ S PS HPLPP I P PS S PCLL * A 
LWI S ALLWCWWAETS S 


5884 


4261 


2522 


GVLARASARLRVPLTGVRACAE P E VGAE PAKVAGAAE PDEDGGR 
S RLRDCGD YT P S ERLG P KGAML W FQGA I PAAIATAKRSGAVFW 
FVAGDDE0STOMAASWEDDKVTEA t ?^N^KVATK"TnTVQB , ar , T m? 

SQIYPWCVPSSFFIGDSGIPLEVIAGSVSADELVTRIHKVRQM
HLLKSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE
IPSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDLNIRVE
RLTKKLEERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE
ELTKRMLEERNREKAEDRAARERIKQQIALDRAERAARFAKTKE
EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF
TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY
KKKLLDLELAPSASWLLP/ALFINF*AGRPTASIVHSSSGDIW
TLIIGTVLYPFLAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS
SSKSEKREPVRIO^VLEKRGDDFKKEGKIYRLRTQDDGEDENNTW
NGNSTQQM


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFLDV 
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 
YLQ IDEEEYGGTWELTKEGFMTS FA/ I VHGHLDHLLHCHPL* LM 
VYSSQVLPIQSKGPS 


5886 


86 


1341 


PFRGRALTLKKQPRPGVAPPSU3TCHKSDPGRPAAQSQPPSPGS - 
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 
EVLLEALFLTVDPYMRVAAKRLKBGDTMMGO^VAKVVESKNVAL 
PKGT I VLAS PGWTTHS I SDGKDLEKLLTEWPDT I PLSLALGTVG 
MPGLTAY FGLL E I CGVKGGETVMVNAAAGAVGS WGQIAKLKGC 
KWGAVGSDEKVAYLQKLGFDWFNYKTVESLE ETLKKAS PDGY 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
PPEIGIYQELRMEAFWYRWQGDARQKALKDLLKWVLELPYFVI 
D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKTI VKA 


5887 


1937 


104 


APGCRGCRATR C P CRG PR WDS LGDE AARS PAAPGGAPGLLGLRE 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 
FC I H ITN\ * NLH YPLLIQKYL/NENN FDTLMKTS DGFTLNAES Y 
VSFTTKLDIPTAAKYEYGVPLQTSDSFLRPPSSLTSSLCTDNNP 
AAFLVNQAVKCTRKINLEQCEEIEALSMAFYSSPEILRVPDSRK 
KVP ITVQS I VIQSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSSVWPLQQKFEIHFLQ 
ENTQ P VP LS GNPG YWGLPLAAG FQPH KGSG 1 1 QTTN R YGQLT I 
LHS TTEQDCLALEGVRTPVL FG YTMQSG CKLRLTGAL PCQLVAQ 
KVKS LLWGQGFPDYVAPFGNSQGP / ADMLDW VP 1HF I TQSFNRK 
DS CQLPGALVI E VKWTKYGS LLNPQAKI VNVTANLI S S S FPEAN 
SGNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 
V 


5888 


375 


2302 


LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG 
LE LHPDYKT WG P EQVCS FLRRGGFEE PVLLKN I RENE I TGALLP 
CLDESRFENIX3VSSLGERKKLLSYIQRLVQIHVDTMKVINDPIH 
GHIELHPLLVRIIDTPQFQRLRYIKQLGGGYYVFPGASHNRFEH 
S LG VG YLAQCLVHALGEKQPE LQ I S ERD VLCVQ I AGLCHDLGHG 
PFSHMFDGRF I PliARPEVKWTHEQGSVMMFEHLINSNG I KPVME 
QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
IVSNKRNGIDVDKWDYFARDCHHLGIQNNFDYKRFIKFARVCEV 
DNE LR I CARDKE VGNLYDM FHTRNS LHRRAYQHKVGK I IDTMIT 
DAFLKADD Y I E I TGAGGKKYRIS TAI DDMEAYTKLTDNI FLEIL 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
h P KE VASAKP KVLLDVKLKAEDF I VDVINMD YGMQEKNP I DHVS 
F YCKTAPNRAI R I TKNQ VSQLLP \ E KFAEQ \ L I RVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\QLFKDDPM 


5889


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDLDPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVI IAGNNDS KAKQWSKI KEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FG I FI L\DLASMTS IRQ FVQKFKMKKI PLHVL INNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTLKESGSPGHSARWTVS 
SATHYVAELNl^DIjQSSACYSPHAAYAQSKliALVLFTYHLQRLL 
AAEGSHVTANWDPGVVNTDLYKHVF^ATRLAKKLLGWLLFKTP 
DEGAWTSIYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ 
LWSKS CEMTGVLDVTL 


5890 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVTYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5892 


1764 


379 


WLRVCGRLSVNSAVSSRTGGWSAGLTCAMQRLQVVLGHLRGPA
DSGWMPQAAPCLSGAPHASAADVVVVHGRRTAICRAGRGGFKDT
TPDELLSAVMTAVLKDVNLRPEQLGDICVGKVLQPGAGAIMARI
AQFLSDIPETVPLSTVNRQCSSGLQAVASIAGGIRNGSYDIGMA
CGVESMSLADRGKPGNITSRLMEKEKARDCLIPMGITSENVAER
FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG
TKRSITVTQDEGIRPSTTMEGLAKLKPAFKKDGSTTAGNSSQVS
DGAAAILLARRSKAEELGLPILGVLRSYAWGVPPDIMGIGPAY
AIPVALQKAGLTVSDVDIFEINE\AFASQAAYCVEKLRLPP*EG
*TPLGGASGP*GHPLGLHWGHVQVITLAQ*S*SARGKRAYRSGC
PCAIGSWNGSPLPVFEYPWGT


5893 


3 


1653 


I LS KRRCQKAKTKELMAKKVAVI GAGVSGL IS LKCCVDEGLE PT 
CFERTEDIGGVWRFKENVEDGRASIYQSVVTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHI LPHI PLKSFP 
GMER FKGQYFHSRQYKHPDGFEGKR I LVIGMGNLGSDI AVE LSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMI EQQMNRW FNH ENYGLE PQNKY I MKE P VLNDD VP S RLL 
CGAI KVKSTVKELTETSAI FEDGTVEENIDVIIFATGYSFSFPF 
LEDSLVKVENNMVSLYKYIFPAHLDKSTLACIGLIQPLGSIFPT 
AELQARW VTRVFKGLCS LPS ERTMMMD 1 1 KRNEKR I DLFGES QS 
QTLQTTfyVDYLDEIJUjEIGAKPDFCSLLFKDPKIjAVRLYFGPCN 
S Y* YRLVGPGQWEGARNAI FTQKQR I LKPLKTRALKDSSNFS VS 
FLLKI LGLLAWVAFF\ CQLQWS 


5894 


174 


1673 


RYS P KKVLQNKESSLKLGMATALVS AHS LAPLNIjKKEGLRVVRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKELQARVQEH 
HPESREDWWLEDLQLDLGETGQQVDPDQPKKQKILVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIVVTDSCGRVES 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
IHTGEKPYLCIHCGKNFRRSSKLNRHQRIHSQEEPCECKEOGKT 
FS QALLLTHHQ RIHSHSKSHQ CNE CG KAFS LTS DL I RHHR I HTG 
EKPFKCN I CQKAFRLNSHIAQHVRIHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCTHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5896 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


NCPKS KE PNGVRAPSLPS PLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILMSTMRNQARLKVLRARNDLISDLLSEAKLRLSRIVEDP 
E VYQG LLD KLVLQG LLRLLE P VM I VRCR P \ QDL LLVEAAVQKA I 
PEYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGNQRIKVSK 
TLESRLDLSAKQKMPEI RMALFGANTNRKFFI 


5900 


64 


1409 


KAASRDS PCLE FCP LCG VS SHDLQHRMWYHRLSHLHS RLQDLLK 
GG VI YPALPQPNFKS LLPLAVHWHHTAS KSLTCAWQQHEDHFEL 
KYANTVMR FD YVWLRDH CRS AS C YNS KTHQ RSLDTAS VDLC I KP 
KT I RIiDETTLFFTWPDGHVTKYDLNWLVKNS YEGQKQKVIQPR I 
LWNAE I YQQAQVPS VDCQS FLETNEGLKKFLQNFLLYGIAFVEN 
V PPTQEHTE KLAER I SL I RETI YGRMWYFTS DFS RGDTAYTKLA 
LDRHTDTT YFQEPCG I QVFHCLKHEGTGGRTLLVDGFYAAEQVL 
QKAPEE FELLS KSAI \KHEYIEDVGECHQPHDWDWAQS* ISTHG 
/ YKELYLI RYNNYDRAVINTVP YDVVHRWYTAHRTLTI ELRRPE 
NE FWVKL KPGR VL F I DNWR VLHGREC FTG YRQLCGCYLTRD D VL 
NTARLLGLQA 


5901 


1 


2121 


VAI EQTS LKMMQAVGGAPARPTGEY I CNQCGAKYTS LDS FQTHL 
KTHLDWLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 
SCDKQFTSVDDLQKHLIoDMHTFVFFRCTLCQEVFDSKVSIQLHL 
\ AVKHS NEKKVYRCTS CNWDFRNETDIiQLHVKHNHLENOGKVHK 
CIFCGESFGTEVEI^CHITTHSKKYNCKFCSKAFHAIILLEKHL 
REKHCVFETK1PNCXSTNGASEQVQKEEV 

DGS EEDVDTSEPMYGCDI CGAAYTMETLLQNHQLRDHNIRPGES 
AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKHYM 
CPICGERFPSLLTLTEHKVTHSKSLDTGNCRICKMPLQSEEEFL 
EHCQMH PDLRNS LTGFR CVVCMQTVTS TLEL lU. HGT rnMQK. HjIN 
GS AVQTTGRGQHVQKLYKCAS CLKEFRS KQDLVKLDINGLP YGL 
CAG CVNLS KSAS PG INVPPGTNRPGLGQNENLS AI EGKGKVGGL 
KTRCS*LATFKF*VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVSPMPRISPSQSDE KKTYQC I KCQMVF YNEWD I Q VHVANHM I D 
EGLiVHECKLCSQTFDSPAKLQCHLI EHS FEGMGGTFKCP VCFTV 
FVQANXLQQHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPS I RQS IGSTSVSRWLTS LFTY LDHTAD VQ * V* REF 
I PLKPRQ* ED *MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VF.PLQTVEVETQGDDLQSLLFHFLDEWLYKFSADEFFI P \GWGE 
firoJjbKJajrUvjIfciVKAl L ibAMyv liNhaJN Jr£»vr V ± xul 


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PALFALSAVPGGAASPMPPSGLRLLPLLLPLLWLLVLTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVIjALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVET 
HNEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLLRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGVV 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATI HGMNRP FLLLMATPLERAQHLQS \SRHRQAL\DTNY\ CFSF 

urTDMPT.Dr 1 AWP*UT.T POVTIT \ nU3\ VUT\UP\ Ptf^VHlNPrVT, 
n\y\yKrt LunV. / V XI w Hjj X f XiMJjj \Kpri \ r>.ri± \tlEt \ " IN.O I fTrtlN r \U 

GPCPY I WSLDTQYSKVLALYNQ\HKPG\ASAAP \ CCVPQALEP\ 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEEIEIOAINTFKEEQRLIYEELIKEEKTTNNELSAISRKIDTW 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFEKFLQQTGG 
RCGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKES IQIWKTKKQQKREEI FKLKEKADNTP 
VLFHNKQEDNQKQ KEEQRKKQKLAVEAWKKQKS I EMSMKCASQL 
KEEEEKEKKHQKERQRQFKLKLLLESYTC^KKEQSEFLRLEKEI 
REKAEKAEKRKNAADEI SRFQERDLHKLELKI LDRQAKEDE KSQ 
KQRRLAKLKEKVENNVS RD P S RL Y / NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNEi^rVRLRTIGELIjAPAAPFDKKCGRENWTVAFAP 
DGS Y FAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSS LRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLl^TGI^SGRIKJWDVYTGKLLLNLVDHTGVVRDL 
TF AP DGS LI L VS AS RD KTLRVWDLRDDGN\ KM KVLRGHQNWVY\ 
SCAFS PDSSMLCS VGASKAWAAI LV* LRLCWHHSHTGATM VLS 
W AERVAS LATGLGATFTI G * SNLAFVLQGVLYVHRCWSMSTFCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNG FS VLFFG I LS DSRD I LRL* FNLKFVL I FF * K* CIVS VQK 
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRILFRAILHS* 
LL I FR I * NCI * TYS * I I DP FY I QMT YDRG * FGKNKMVKF * F I EM 
*LYYFHKIAFSFCNVV*HPCCLPKKFHLAVNILFACSICFSS*A 
QVGDPSLL*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL* YLTLFIS VYFS* LVFGINGFQYS FWKLHCLYFWFRLI 
FKLTFNRNI *NRICMSALINLKTDFNLTMTLSIFFKLLI IYNA* 
YNLN*I*QF*YKMCHFVLCMSE*SYNICLFIAGF\LWNMDKYTM 


5906 






IKKLEGHHHDVVACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWRIDEDYPVQVAPLSNGLCCAFSTDGSVLAAGTHDGSVYFWAT 
PRQVPSLQHLCRMSIRRVMPTQEVQELPIPSKLLEFLSYRI 


5907 


146 


2038 


KEGAGSGRMASGAVYNPYIEIIEQPRukUMRKRyKCEGRSAGSI 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVGKDCRD \ G YYEAE FGQE \ RRP \ LFFQN \ IjG I RCVKKKEVKE 
A\ 1 1 TR \ I KAG I NP FDVP * KQLND I EDCDLDWRLWFRVFLPDG 
HGNL \ TTALPPV\ VSSP I YDNRAPNTAELRVCR VNKNCGS VRGG 
DEIFLLCDKVQKDDIEVRFVLNDWEAKGIFSQADVHRQVAIVFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QS AG I TVN FP E R PRPG LLGS I GEGR YFKKE PNL FSHDA WREM P 

TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTN PL S S FS TRTLP SN SQGIPPFLRI P VGNDLNASNAC I YNN 
ADD I VGMEAS S M PS AD L YG I SDPNMLS NCS VNMMTTS S DS MGET 
DNPRLLSMNLENPSCNSVLDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SG IGSMQNEQLSDS FP YEFFQV 


5908 


99 


1873 


TYhhSSViSS * *hlLl)'l t }<l KSQVKV / RKGHKKl *> WP YPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
A Y YKEFRKWE YS D V I L E VLD ARD PLGCR CFQME EAVLRAQGNK 
KLVLVLNKIDLVPKEWEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCS VP VDQASE S LLKS KAC FGAENLMR VLGNYCRLGEVRTHI R 
VGWGLPNVGKSSLINSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLLDAPGIVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC 
NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWVSGKISFYIPPPATHTLPTHLSAEIVKEMTEVFDIEDT 
EQANEDTMECLATGESDELLGDTDPLEMEIKLLHSPMTKIADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 

SVDRRSVI^RIMETDPI^QGQAI^ALKNTQCKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5909 


247 


975 


HCG I KKRGEGs G5 P 5 PAS GG KULG CQ I PB PS LP S KE ETH PHTRA 
HTRTLRATLTRRP PRSHS TRLR FPMP L DGDGGLAS WK/ PMRER * 
GWRRPAKAAGASLGVAATGKRGCRWSKRYLQKATKGKLLI 1 1 FI 
VTLWGKWSSANHHKAHHVKTGTCEWALHRCCNKNKrEERSQT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWSCSSGNKVKTTRVTH 




1 


5002 



PAI PGSTI I WA^USHS AARADGRHGS LPSQSQAPGALCGARAPP 
S SNLRADRS M I CAQARAGKNL YHNRFLGLAAMAFP SRNSQS LRR 
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSbTDLVTSDSR 
S TLMGRS S YYS I GHSQDL V I HWD I KEE VDAGDW I GM YL I DE VLS 
ENFLD YKNRGVNGS HRGQ 1 1 WKI DAS S Y FVEPETKI CF KYYHG V 
SGALRATTPSVTVKNSAAPIFKSIGADETVQGCX3SRRLISFSLS 
D FQAMG LKKGMFFNPD P YL K I S I QPGKHS I F PAL PHHG QERRS K 

IIGNTVNPIWQAEQFSFVSLPTDVLEIEVKDKFAKSRPIIKRFL 

GKLSMPVQRLLERHAIGDRWSYTLGRRLPTDHVSGQLQFRFEI 

TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 

SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 

VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGBEASALLLE 
DGEAPASTKEEPLEERATTnc;R AaPB''PT7T?v-r/^c-c.^T^Tr«>^ 

— ^ *^^-±jiri^r.cr+f\x ivoKALjKtiiilShJvEQEEEGDVSTLiEQG 

EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 
I HT LLHSM PS AQGGS AAEEEDG AE EEST LKDS S E KDGLS E VDT V 
AADPSALEEDREEPEGArPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
3 P EGLES PVAG PSNRREGECP ILHNSQP VSQLPSLRPEHHHYPT 
rDEPLPP^EARlDSHGRVFYVDIIVNRrrTWQRPTAAATPDGMR 
^SGSICXJMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
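The symbol legend in the column header can be applied mechanically to any segment in the table. The following is a minimal sketch, assuming the segments are plain strings in which whitespace carries no meaning; the names LEGEND, MARKERS and summarize_segment are hypothetical and are not part of the listing. It counts residues and the three special markers, and collects any character the legend does not define.

```python
# Sketch only: the code table is transcribed from the column legend.
# LEGEND, MARKERS and summarize_segment are hypothetical names.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Special symbols from the legend, mapped to the counter they increment.
MARKERS = {
    "*": "stop_codons",
    "/": "possible_deletions",
    "\\": "possible_insertions",
}


def summarize_segment(segment):
    """Count residues and legend markers in one amino acid segment."""
    # Assumption: whitespace inside a segment is layout only, so drop it.
    cleaned = "".join(segment.split())
    summary = {
        "residues": 0,
        "stop_codons": 0,
        "possible_deletions": 0,
        "possible_insertions": 0,
        "unrecognized": [],
    }
    for symbol in cleaned:
        if symbol in LEGEND:
            summary["residues"] += 1
        elif symbol in MARKERS:
            summary[MARKERS[symbol]] += 1
        else:
            # Anything outside the legend is reported rather than guessed at.
            summary["unrecognized"].append(symbol)
    return summary
```

Characters that the legend does not define are reported in the "unrecognized" list instead of being silently dropped, so a segment with damaged characters can be flagged for review against the original listing.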








GGGGS DS EABS SQSSLDLRREGS LS PVNSQ KI TLLLQS P AVKF I 
TN'PEFFTVLHANYSAyRVFTSSTCLKHMILKVRRDARNFERyQH 
NRDLVNFINMFADTRLELPRGWE I KTDQQG KS FFVDHNS RATT F 
I DPR I PLQNGRLPNHLTHRQHLQRLRS YSAGEAS E VSRNRGASL 
LARPGHS LVAAI RSQHQHES LPLAYNDKI VAFLRQPN I FEMLQE 
RQPSLARNHTLREKIHY IRTEGNHGLEKLS CDADLVILLSLFEE 
E IMS YVP LQAAFHPGYS FS PRCS P CSS PQNS PGLQRASARAP SP 
YRRDFEAKLRNFYRKLEAKGFGOGPGKI KLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSRBFFFLIjSQELFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 
MKDNN I TDI LDLT FTVNEE VFGQVTERELKSGGANTQVTEKNKK 
EYIERMVKWRVERGWCX}TEALVRGFYEWDSRLVSVFDARELE 
LV I AGTAE IDLNDWRNNTEYRGGYHDGHLVIRWFWAAVERFNNE 
QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP * FCKWGKITS 
LPPRG \ HTCLQPDWDLPTVS PRTPMLYE K\ LLTA\ VEETSTFGT 


5910 


1526 


446 


VAEFAAMEPGRTQI KLDPRYTADLLEVbKTNYGI PSACFSQPPT 
AAQLLRAXjGPVELALTSILTLIiALGS IAI FLEDAVYLiYKNTLCP 
I FCRRTLLWKSSAPTWSVLCCFGLWI PRS LVLVEMT I TS FYAVC 
FYLLMLVMVEGFGGKEAVIiRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKIiQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTFL 
RGAQL YGS TLSS A/ CS TLLAL WTLG IIS RQARLHLGEQNMGAKF 
ALFQVLL I LTALQPS I FSVLANGGQ I ACS PPYSS KTRSQVMNCH 
LL ILE T F LMT VLTRMY YRRKDH KVGYE T FS S PDUDLNLKALR WM 
AWTMKGCCTH 


5911 


109 


595 


QLPLAPCIQGKGLEMRSPKPQS FI IRSSHSGAGLLVKNPSTPVF 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRIiECS 
GTISAHCNLRLPSSSNS PAPAS * LAGITGVCHHAQLI FVFLVET 
G FHHVGQAGLELL / NWIHL PR P P KVLGLQA 


5912 


924 


277 


M I LNKALMLGALALTTVMS PCGGEDI VADH VAS YG VNL YQS YG P 
SGQYSHE FDGDEE FYVDLERKETVWQLPL FRRFRRFDPQFALTN 
IAVLKHNLNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLSNGHSVTEGVSETRPSSPKSDHFLLQDQ 
VTS PS FP FE * * DL * TAKVEQLGAWFEPLLKHWGAE I PTTL 


5913 


46 


1198 


QLRMAGAEGAAGRQSELEPWSLVDVLEEDEELENEACAVLGGS 
DSEKCSYSQGSVKRQALYACSTCTPEGEEPAGICLACSYECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEMVCQACMKRCS FLWAYAAQLAVTKI ST\GMMDWCGTLM 
E * /DDQEVIKPENGBHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGSS SESDLQTVFKNESLNAES KSGCKLQELKAKQL I KKDTAT 
YWPLNWRSKLCTCQDCMKMYGDIJDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLS S MNRVQQVELIC/G IQ * FED 


5914 


960 


124 


NLGGSELPPEEALFIQVASMNQRRVDFYLASIEDMLVAI/GGRN 
ENGALSS VET YS PKTDS WS YVAGLPRFTYGHAGTI YKD FVYISG 
GHDYQIGPYRKNIiLCYDHRTDWEERRPMTTARGWHSMCSLGDS 
I YS IGGSDDN I ES MERFDVLGVEAYS PQCNQWTRVAPLLHANS E 
SGVAVWEGRIYILGGYSWENTAFSKTVQVYDREADKWSRGVDLP 
KA I AGGS ACF I AP * S LGQRTRKRiCAKARGTRTGAS DPS CAS WDH 
PHRHLPGLCRPAATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHCHSPRPRTCPPGAIjQAPEA 
PAS RAEG PVAWVNGHTEGPAPARSAP KE P PGLPRPLGS FPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRE LPG E E P S AH P VHQG LP AERRG PLQ R VQE PLRG VQTGPDLRS 
PVLQELPGPAGGEFPEGL* *AAGPAAH 


5916


256 


633 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
TVTGAVHRHLNHVAGI I PWVLHSQLKPTAATAQDQWTSQQYPDH 
PTRL I LQ * NQATADKNN* TTAIiLQPHQRL\ VS PRMAEA 


5917 


1343 


827 


AHQILTYLEP/ ICLWNYNKILTVFLTKSVLEI *KFIHTPQTYR 


5918 






K'NDPFGIKEVYVSRRLRKTS^/feLAVTFLKOAWSKECVPVDQ " 
FMEHLLPSLLSLASDPVPNVRVLLAKALRQMLLBKAYFRNAGNP 
HLE V I EET I LALQSDRDQDVS FFAALEP KRRNI I DTAVLE KON 


5919 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PG P ARRGRRRMET P F YGDEALS GLGGGAS GSGGTFAS PGRLFPG 
A? P TAAAGS MMKKDALTLS LS E Q VAAALKP APAP AS Y P PA \ ADG 
APS AAPPDGLLAS PDLGLLKLAS PELERL 1 1 QSNGL VTTTPTS S 
QFLYPKVAASEEQEFAEGFVKALEDLHKQNQLGAGRAAAAAAAA 
AGG PSGTATG SAP PG ELAPAAAAP EAP VYA\NLS S Y \ AGGCRGL 
RGG AAT \ VAFAAE P VP FP PP P P PGALG PRRP / RLALQGRR PQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS * SPEHGSLASTASLLREQVAOLK 
QKVLSHVNSGCQLLPQHQVPAY 


5920 


1 


4254 


■ravUGDSQGTPTSSQGSINMEHWISQAIHGSTTSTTSSSSTQSG 
GS GAAHRLAD VMAQTH I ENHS AP PD VTTYTS EHS I QVER PQQS T 

GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
PPSLEAALQRWGTISPKAPCLTTMDTNGKPLYILTYGKLWTRSM 
KVAYSILHKLGTKQEPMVRPGDRVALVFPNNDPAAFMAAFYGCL 
LAE WP VP I EVP LTR KD AG SQQ I G FLLG S CG VTVALT S DACHKG 

LPKSPTGEIPQFKGWPKLLWFVTESKHLSKPPRDWF\PHIKDAN 
NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTQACGYTEAE 
TIVNVl^FKKDVGLWHGILTSVMNMMHVISIPYSLMKVNPUSWI 
QKVCQYKAKVACVKSRDMHWALVAHRDQRBINLSSLRMLIVADG 
ANPWSISS CDAFLNVFQS KGLRQE VI CPCASS PEALTVAI RRPT 

DDSNQPPGRGVLSMHGLTYGVIRVDSEEKLSVLTVQDVGLVMPG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FEVFAMTSSGAPISEYPFIRTGLLGFVGPGGLVFWGKMDGLMV 
VSGRRHNADDI VATALAVEPMKFVYRGRIAVFS VTVLHDER IVI 
VAEQRPDSTEEDSFQWMSRVLQAIDSIHQVGVYCLALVPANTLP 
KT PLGG I HLS ETKQLFLEG S LHP CNVLMCPHTC VTNLP K PRQKQ 
PEIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE 
VLQWRAQTTP DH I L YTL LNCRGA I ANS LTCVQLHKRAE K I AVML 
MBRGHLQDGDHVALVYP PG IDL IAAFYGCLYAGCVP I TVRPPHP 

QNIATTLPTVKMIVEVSRSACLMTTQLICKLLRSREAAAAVDVR 
TWPLILDTDD*PKKRPAQICKPCNPDTLAYT.DF<5VQTTr'MT ivmr 
KMSHAATSAFCRS I KLQCELYPSREVAI CLDPYCGLGFVLWCLC 
SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGS QTE S LKARGLD LS RVRTCVWAEER PR I ALTQS FS KL 

FKDLGLHPRAVSTSFGCRVNLAICLQGTSGPDPTTVYVDMRAIiR 

HDRVRLVERGS PHS LPLMESGKIL PGVR 1 1 IANPETKGPLGDSH 

LGEIWVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 

TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 

IETSVIRAHKSVTEC^VFTWTOLLVVWELTXSSEQEALDLVPLV 

TNVVLEEHYLIVGVVVVVDIGVIPINSRGEKQRMHLRDGFLADO 
LDPIYVAYNM 


5921 


1381 


1499 


ULGAVAHAGVSKI PP* LFPPLHPTFLSLWCLHHKLP /HPPGASM 
VRP P WP RR P PAH I SS VRQ AS TQVPRTVPHTQR VANI GTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQEPAVHIPGQEPLTASM 
LAAAPLHEQKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLES PES LHAK I DEA VAVLQ AHQAME QP KAYMH 


5922 


727 
2475 


157 


495


v l (j(»t ijLwuyiA»ULiFKETPLKPMDAFTUSGLKRKFDDVDV 
GSSVSNSDDEISSSD S ADS CDS LNPP TTAS FTPTS I LKRQ KQLR 

RKNVRFDQVTVYYFARRC^FTSVPSQGGSSLGMAQRHNSVRSYT 
L CE FAQEQ E VNHRE I LREHLKE EKLHAKKMKLTKNGTVES VEAD 
3LT1J3DVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 
RMS F P CGCS RDGCGNMAGR I E FNP I R VRTHYLHT I MKLELES KR 
2\GAAQQPQ\ *GALPDCQLQPDRSTGL *DPSWIGS KGLS FTGKG 
\AATHLI I LR VI EKRGAEGKRK 

a y SNWGLFPS VFIQVPRSRTGNLKPIFLFYS YYE\ CMETLKG \T 








CLYNATQ YKVCS PRNDR P DACYN P S E P AATTVFE I RTGLLLGDT 
SKII TRTEEKEI PKQI TLRFDACAAINS KKLEIGCGSLN * ERS * 
RVENKYVCHESGVCKNCAYWPCVI *AT*KKNKNDSVYLQKGEAN 
PS CAAGHCNPLELI I TNPLDPHWKKGER VTLG INRTGLKPQ WI 
LI KGEVHKCS PKPVFQTFYEELNLPAPELLKKTKNLFIiQLAENV 
I FLLNGTS CYVRGGTTIGDRWPWEA* ELVPTDPAPD 1 1 PI* KAE 
ASNF* VLKTSI IRQYCIAREGKDFI I PVGKPNCIGQKLYNSTTK 
TIT+*DLNHTEKNPFSKFSKLKTA*AHAESH*DWTVPSGLY*IC 
RHRAYFRLPNKWADSCVIGTI KP S FFLLP I KMGELLGFS VYASR 
E KKG I VIGNWKDNEWP RER I IQ YYGP ATWAQDGS WG YR/ TP /VY 
MLNWI IRLQAILEI ISNETGRALTVLAWQETQMRNAIYQNRLAL 
DYLLVAEGGVCRKF^TNCCLQINDQGQVVKNIVRDMTKLAHVP 
IQ VWHKFDPES LFGKWFPAIGGFKTL I VGVLL V I RTCLLLPCVL 
PLLFQM I KG I VATL VHQ KTS AHVNYMNHYRS I S QRDSKS EDESE 
NSH 


5923 


137 


638 


QLCGRRGQRFRTSIKRMHPI*RTCPNT^/llLLSOENtOlRDL 
QQENRELWI S LEEHQDALELI MS KYRKQMLQLMVAKKAVDAE PV 
LKAHQSHS AE I ESQI DRI CEMGEVMRKAVQ VDDDQ FCKI QEKLA 
QLELENKELRELLSISSESLQARKENSMDTASQAIK 


5924 


274 


2146 


EKG KVKDAGAEQWI S LS LSCKGSWETQFSNHLNS LT PPTS VRRM 
PL I TTVTLLKMVARHHMKLLCS KAFS TQLQQKI FLHSQMG I HHQ 
SVOIKLKPNTSHI IS ILMGQPMALVQLETLAPLTI I IQKFQTQD 
HMKFW KNLPLHSHHLTPS VPQTVI PKKTGS PE I KLK I TKT I QNG 
RELFESSLCGDLLNEVQASE\Q*NQSIESRKEKRKKSNKHDSSR 
SEERKSHKIP KLEP E EQNRPNER VDT VS E KPRE E P VLKEGS P S S 
ANT I FCSNNGS VHW \ FKFQVGDLVWS KVGTYPWWPCMVSS DPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REERIEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVL 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKS LPAS I TMHKGS LDLQKCNMS PWKIEQVFALQNATG 
DGKFI DQFVYSTKG IGNKTE I S VRGQDRLI ISTPNQRNEKPTQS 
VSS PEATSGSTGS VE KKQQRRS I RTRS ES EKSTEWPKKKI KKE 
QVGFLHVES 


5925 


216 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEEDHMWGQDSTL 
QDTPPPDPE I FRQR FRR FCYQNT FGPR EALS RLKE LCHQWLRP E 
INTKEQILELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DLELDLSGQQVPGQVHG PEMLARGMVPLDP VQES S S FDLHHEAT 
QSHFKHSSRKPRLIjQSRALPAAHIPAPPHEGSPRDQAMASALFT 

ADSQAMVKI EDMAVS LI LEEWGCQNLARRNLSRDNRQENYGSAF
PQGGENRNENEESTSKAETSEDSASRGBTTGRSQKEFGEKRDOE
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK
IIHTGEKPYECSECGKAF\SLNS\NLVLHQRI\HTGEKPHECNE
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK
\AFTRSSTLTLHHRIHARERASEYSPASLDAFGAFLKSCV


5926 


2 


233 


DRCLMLKQGS Q PGS P PAT / CE PPAPP VYQAPCQSC PE P PGAHEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 


KHFS KFGSQALYQLKRPASGQNS IS VMPAQKITKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVWTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMI<RQEKERLERINPJu^C<3 
WRNVLSAGGSGSVKAPFLGSGGTI APS S FS SRGQ YEHYHAI FDQ 
MQQQRAEDNEAKWKRE I YGRGLPERQ KGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ I ESLKAHANARAAVLICEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSD VS PPLGQHETGGS PSKQQMRS VI S VTSALKE VGVDS 
SLTDTRETS E EMQKTNNA I S S KR E I LR RLNENLKAQED E KG KQN 
LSDTFE I NVHEDAKEHEKEKS VSS DRKKWEAGGQLVIPLDELTL 
DTS FSTTERHTVGEVI KLGPNGS PRRAWGKSPTDSVLKI LGEAE 








LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQClSHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVWrSEEKETKETQSADRITIQENE7SEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNXNKNSLLIGLSTG 
LFDANNPKMLRTCS LPDLS KLFRTLMDVPTVGDVRQDNLE IDE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE 
DEDENIEICSKIVQNILGNEHQHLYAKILHLVMADGAYQEDNDE 


5928 


4146 


1248 


KHFSKFGSQ AL YQLKR PASGQNS I S VMPAQKI TKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQI ISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNIiAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEKFOSANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRiCEAYEREKKVWEEHLV 
AKGVKSSDVS P PLGQHETGGS PSKQQMRSVI S VTS ALKEVGVDS 
SLTDTRETSEEMQKTNNAISSKREILRRLNENLKAQEDBKGKQN 
LSDTFE INVHEDAKEHEKEKS VSS DRKKWEAGGQLVI PLDELTL 
DTS FS TTERHT VGE V I KLG PNGS PRRA WG K S PT D q VT ■ K T T .r F a f 
LQLQTE LLENTT IRSEIS PEGEKYKPL I TGE KKVQCI SHE INPS 
AIVDSPVETKSPEFSEAS PQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE \ SLPCT I TD\/WI SEEKETKETQS ADRI TI QENEVS EDG 
VS S T VDQLSD IH I EPGTNDSQHS KCD VD KS VQ PE P FFHKWHS E 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKKKNKNSLLIGLSTG 
LF1DANNPKMLRTCS LPDLS KLFRTLMDVPTVGDVRQDNLE I DEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEES VLKNSDVE PTANGTDVADEDDNPS SESALNEEWHSDNSD 

DEDENIE I CS KI VQNI LGNEHQHLYAKI LHL VMADGAYQEDNDE 


5929 


31


1558 


LDFSMTTQLPAYVAILLFYVSRASCQDTFTAAVYEHAAILPNAT 
LTPVSREEALALMNRNLDI LEGAITSAADOGAHI I VTPEDA TVf? 
WNFNRDSLYPYLEDIPDPEVNWIPCNNRNRFGQTPVQERLSCL\ 
AKNNSIYWANIGDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVP KE PE I VTFNTTFGS FG I FTCFD I 
LFHDPAVTLVKD FHVDT I VF PTAWMNVL PHLS AVE FHS AWAMGM 
RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHSAWNWTS YASS I EALSSGNKE FKGTVFFDEFTFVK 
LTGVAGNYTVCQKDLCCHLS YKMSENI PNEVYALGAFDGLHTVE 
GRYYI<JICTLIJCCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
YVF PEVLL S ENQLAPG E FQVS TDGRLFS LKPTSGP VLTVTL FGR 
LYEKDWASNASSGLTAQARIIMLIVIAPIVCSLSW 


5930 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVW IPS ERHGFEAAS I KEERGDE VMVE LAENGKKAMVNKDD I Q 
i^PPKFSXVEDMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS ILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
I PGE \LERQLLQANP I LESFGNARTVQNDNS S R FG KF I RINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALIiDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRrVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 








NAI PKGFMDGKQACERM IRALELDPNLYRIGQSKI FFRAGVLAH 
LEEERDLKITDI I 1 FFQAVCRG YLARKAFAXKQQQLSALKVLQR 
NCAAYLJCLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAXKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ 
DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
I KE KKLMEDRIAECS S QLAEEEE KAKNLAKI RNKQEVM I SDLEE 
RLKKEEFCTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 
AKKEEELQGALARGDDETLHKNNALKWRELQAQIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKEIJVCEVKVLQQVKAESEHKKKKLDAQV 
QELHAKVS EGDRLR VE LAE KAS KLQNE LDNVSTLLEEAE KKG I K 
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 
EQQE EEEEARKNLEKQ VLAIiQSQLADTKKKVDDDLGT I ES LEEA 
KKKLLKDAE ALS QRLE E KALAYDKLE KTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 
KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 
RALAVAS KKKME 1 DLKDLE AQI EAAN KARDEV I KQLRKLQAQMK 
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDELADE I TNS ASGKS ALLDEKRRLEAR I AQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE 
QSAKERAAANXLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5931 


113 


6082 


RGNCFWIVPKi'MAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVWIPSERHGFEAASIKEERGDEVMVELAFJIGKKAMVNKDDIQ 
KMNPPKFS KVEDMAELTCLNEAS VLHNLKDRYYSGLI YTYSGLF 
CVVTNPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKVIQYIiAHVA5SHKGRKDHN 
I PGE\LERQLIiQANP I LESFGNARTVQNDNS SRFGKFIRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLBGFNNYRFLSNGYI P I PGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNI S FKKERNTDQASMPENTVAQKL 
CHLLGMNVME FTRAI LTPR I KVGRDYVQKAQTKEQADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFBLN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPIjND 
NVATLLHQS S DRFVAE LWKDVDR I VGLDQ VTGMTE TAFGS AYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI I PNHEKRAGK 
LD PHLVLDQLR CNGVLEG I RICRQGFPNR I VFQEFRQR YE ILTP 
NAI PKGFMDGKQACERMIRALELDPNLYRIGQSKI FFRAGVLAH 
LEEERDLKITDI 1 1 FFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ 
DLEEQLDEEEGARQKLQLEKVTAEAXIKKMEEEILLLEDQNSKF 
IKE KKLMEDR I AE CS S Q LAE EEE KAKNLA KI RNKQEVN I S DLEE 
RL KKE E KTRQ ELE KAKR KLDGETTDLQDQ I AELQAQ I D E LKLQL 
AKK^EELC^AIARGDDETLHKNNALKVVRELQAQIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK 
FAKDAAS LESQLQDTQELLQEETRQKLNLSS R I RQLEEE KNS LQ 
EQQEEEEE ARKNLE KQVLALQSQIJUOrKKKVDDDLGTIES LEEA 








KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLE KKQ \ KKFDQLLAEEKS I SARYAEERDRAEAEARE 
KET KALS IiARALEEALEAKEE FE RQNKQLRADMEDLMS S KDD VG 
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQ AMKAQ F E RDLQTRDE QNE E KKRLL I KQVRELEAE LEDERKQ 
RALAVAS KKKMEIDLKDLEAQ I EAANKARDE VIKQLRKLQAQMK 
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDELADE I TNSASGKSALLDEKRRLEARIAQLBE 
ELEEEQSNMELLNDRFRKTTLQVDTLMAELAAERSAAQKSDNAR 
QQL ERQNKE L KAKLQEL EG AVKS K FKAT I S ALE AKI GQL E EQLE 
QEAKERAAANKLVRRTE KKLKE I FMQVEDERRHADQ YKEQME KA 
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHLEE I CFLFLQKGRKLKLSGPRWEEGKPRGTGGLW VKAEANMG 
FGATLAVGLT I FVLS WT 1 1 ICFTCS CCCLYKTCRRPRPV\APP 
PKPP/PVVHAPYPQPPSVPPSYPGPSYQGYHTMPPC.PGMPAAPY 
FMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASQPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDM P SRTRPKS PRKHNYRNESARESLCDS PHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGVVNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPrERNLLALI 
HRMI EFWREGPMFEAMIMNRE INNPMFRFLFENQTPAHVYYRW 
KLYS ILQGDS PTKWRTEDFRMFKNGS FWRPPPLNPYLKGMSEEQ 
ETEAFVEEPS KKGALKEEQRDKLEE I LRGLTPRKND IGDAMVFC 
LNNAEAAEEI VDCITESLS I LKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSE^FKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKErEEVPDDLD 
GAP I EEELDGAPLEDVDG I P IDATPI DDLDGVP I KSLDDDLDGV 
P LDATEDS KKNEP I FKVAPS KWEAVDES ELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
S KFS KYS EMS EE KRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
S HKESSRS RSSHKDS PRDVS KKAKRS PSGSRTPKRSRRSRSRS P 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGG^SOKASSKTRSSDVHSSGSSDAHMDASGPSD
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK
AFSIGKMSTAKRTLSKKEQEELK3CKEDEKAAAEIYEEFLAAFEG

SDGNKVKTFVRGGVVNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIErKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MS FEMKLGWGKAVP IPPHP I YI PPSMMEHTLPF PPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNIiLALI 
HRMI EFWREGPMFEAMI MNREINNPMFRFLFENQTPAHVYYRW 
KLYS ILQGDS PTKWRT5DFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVE E PS KKGALKEEQRDKLEE I LRGLTPRKND I GDAMVFC 
LNNAEAAEEI VDCITESLS I LKTPLPKKIARLYLVSDVLYNSSA 
KVANAS YYRKFFETKLCQI FS DLNAT YRT I QGHLQS ENFKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAP I EEELDGAPLEDVDG I P I DATP IDDLDGVP I KSLDDDLDGV 








PLDATEDS KKNEP I FKVAPS KWRAVDE S E LEAQAVTTS KWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 



SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 
S DSGS FVS SRARREKKS KKGRQEALERLKKAKAGERYKYE VEDF 
TGVYE EVDEEQ YS KLVQARQDDDW I VDDDG IG YVEDGRE I FDDD 
LEDDALDADEKGKDGKARNKDKRNVKKLAVTKPNNIKSMFIACA 
GKKTADKAVDLS KDGLLGDI LQDLNTET PQ ITP PPVM I LKKKRS 
IGAS PNPFS VHTATAVPSGK IAS P VS RKE P PLT P VPLKRAEFAG 
DDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 
GKVWI ESAETHVS CCVMVKNIERTL YFLPREMK IDLNTGKETGT 
P I S MKDVYEE FDEKIATKYKIMKFKS KPVE KNYAFE I PDVPEKS 
EYLEVKYSAEMPQLPQD1.KGETFSHVFGTNTSSLELFLMNRKIK 
GPCWLEVKKS TALNQP VS WCKVEAMALKPDLVNV I KDVS P P PLV 
VMAFSMKTMQNAKNHQNE 1 1 AMAALVHHSFALDKAAPKP PFQSH 
FCWSKPKDCI FPYAFKEVI EKKNVKVEVAATERTLLGFFLAKV 
HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHWSKIGRIjKRSNM 
FKLGGRSGFGERNATCGRMICDVEISAFCELIRCKSYHLSELVQQ 
ILKTERWIPMENIQNMYSESSQLLYLIiEHTWKDA\KFILQIMC 
ELNVL P LALQ I TN I AGN I MS RTLMGGRS ERNE FL LLHAF YENNY 
IVPDKQIFRKPQQKLGDEDEEIDGDTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKFILLLDFNSLYPSIIQEFNICFTTVQRVASEAQKV 
TEDGEQSQIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 
LNPDLI LQYD IRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 
YKGREIIiMHTKEMVQKMNLEVIYGDTDSIMINTNSTNLEEVFKL 
GNKVKS EVNKL YKLLE I D I DG VFKSLLLLKKKKYAAL WE PTSD 
GNYVTKQELKGLDIVRRDWCDtiAKDTGNFVIGQILSDQSRDTIV 
ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 
HVHVALW I NS QGGRKVKAGDTVS YV I CQDGSNLTAS QRA YAP EQ 
LQKQDNLTIDTQYYLAQQIHPWARICEPIDGIDAVLIATGWEL 
\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 
TCGTENI YDNVFDGSGTDMEPSLYRCSNIDCKAS PLTFTVQLSN 
KLIMD I RRFI KKY YDGWL I GEE PTCRNRTRHLPLQ FSRTT3PLCP 
ACMKATLQ P E YSDKS LYTQLCFYR YI FDAECALE KLTTDHEKDK 
LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 
KS 


5936 


1124 


139 


RGEEQFDAEFRRFACIiGFGERLQEFSRLLRAVHRSRAWTCYLAI 
RMLMATCCPS PTTTACTG P WQRAPPLRLL VQKRRADS SGLAFAS 
NSLQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PE THRR VRLHKHG S DR PLG FY IRDGMS VR VAPQG \ LERVPG I FI 
S RLVRGGLAESTGLLAVSDE I LE VNG I E VAGKTLNQVTDMMVAN 
S HN \ LI VT VKP ANQRNNWRGAS GRLTGP PS AGPG PAE P DSDDD 
SSDLVIENRQPPSSNGLSQGPPCMDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTSI^KSTVQIJ^CRLLQDKRYQCVYSLAEIFKVLASFYVILVIL 
YGLTSSYSLWWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPLYSKRFS I FLS E VS ENKL KQ I NLNNE WTVE KLKS K 
LVKNAQDKI ELHLFMLNGLPDNVFELTEMEVLSLELI PEVICLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEENLKILRLKFTEMGK 
I P R WVFHL KNLKE L YLS GCVL P E QLSTMQ L EG FQDLKNLRTL YL 
KS S LSR I PQVVTDLLPS LQKLS LDNEGS KL VVL»NNLKKMVNLKS 
LELISCDLERIPHSIFSLNNLHELDLRENNLKTVEEIISFQHLQ 
NliSCLKLWHNNI AYI PAQIGALSNLEQLSLDHNNI ENLPLQLFL 
CTKLHYLDLS YNHLTF I P EE I QYL \SNLQY FAVTNNNI EMLPDG 
L FQCKKLQ CLLLGKNS LMNLS PHVG E LS NLTHREP I G \NYLETL 
P PELEGCQS LKRNCL I VE ENLLNTLPL P VTERLQTCLDKC 


5938 


395 


1865 


YKGEGFFCNQEARGERRKKKKAMSSPNIWSTGSSVYSTPVFSQK 
^^^ILLLLSLYPGFTSQKSDDDYEDYASlnCTWVLTPKVPEGDV 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YT I D I FFAQTWYDRRLKFNST I K VLRLNSNM VGKI W I PDT FFRN 
S KKADAHW I TT PNRM LR I WNDGR VL YS LRLT I D AECQLQLHN F P 
MDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDTRSWRLYQFSF 
VGLRNTTEWKTTSGDYWMSVYFDLSRRMGYFTIQTYI PCTL I 
WLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKSLPKVS 
YVTAMDLF VS VCFI FVFS ALVE YG \TLHYFVSNRKPS KDKDKKK 
KNPAPTID I RPRS AT I QMNNATHLQERDEE YGYECLDGKDCAS F 
FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYW 
VSYLYL 


5939 


66 


1404 


IRPGYLKEVQENSPGHRAGLEPFFDFIVSINGSRLNK£)NDTLKD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIR 
FCSFMANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNE 
S E DL FS L I ETHE AKPLKL YVYNTDTDN CRE V 1 1 TPNS AWGGEGS 
LGCG I G YG YLHR I PTR PFEEGKKIS L PGQMAGT P I TPLKDG FTE 
VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVLSTGV 
PTVP \LLP PQVNQSLTS VPPMES S YLHLPGLMP FTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 
VDANASESP 


5940 


145 


717 


RR S AS RS AS PRQ S AGTAVTTG TRAGGTCLAAAHHRMRW RADGRS 
LEKLPVHMGLVITEVEQEPSFSDIASLWWCMAVGISYISVYDH 
QG I FKRNNSRLMDE ILKOQQELLGLDCSKYS PE FANSNDKDDQV 
LNCHlAVKVIiSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYLVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVWXiLALPVA 

WGQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

SIICLKNSVWTGAKDRCRRKSCRNPPDPVNGMVHVIKGIQFGSQ 

IKYSCTKGYRLIGSSSATCr ISGDTVIWDNETP ICDRIPCGLPP 

TITNGDFISTNRENFHYGSWTYRCNPGSGGRKVFEIiVGEPSIY 

CTSNDDQVGIWSGPAPQCI I PNKCTP PNVENG I LVS DNRSLFS L 

NEWE FRCQPGFVMKGPRRVKCQALNKWE PELPSCSRVCQPPPD 

VLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAASMRCTPQGDW 

S PAAPTCE VKS CDDFMGQLLMGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 

CRPEYYGRPFSITOLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH 

V I TD I Q VG S R INYS CTTGHRL IGHSS AEC I LSGNAAHWS TKP P I 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGKPSIYCTSNDDQVGIWSGPAPQCI I PNKCTP PNVENG I L 

VSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CS R VCQ P P PD VLHAERTQRDKDN FS PGQE VF YS CE PG YDLRGAA 

SMRCTPQGDWSPAAPTCEVKSCDDFMGQLLNGRVLFPVNljQLGA 

KVDFVCDEGFQLKGSS ASYCVLAGMES LWNS S VPVCEQ I FCPS P 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDL1GESTI 

RCTSDPQGNGVWS SPAPRCGI LGHCQAPDHFLFAKLKTQTNASD 

FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

TAHWSTKP P I CQR I P CGLP PT I ANGDF I S TNRENFH YGS WTYR 

CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 

T P PNVENG I L VSDNRS L FSLNEWE FRCQPG FVMKG PRRVKCQA 

LNKWEPELPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 

EPGYDLRGAASLHCTPQGDWSPEAPRCAVKSCDDFLGQLPHGRV 

LF P LNLQ LG AKVS FVCD EGFRLKG S S VSHCVLVGMRS LWNNS VP 

VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 








T FNL IGEST I RCTS D PHGNGVWS S PAPRCELS VRAGHCKTPEQF 
P FAS PTI P I NDFE FP VGTSLN YECRPGYFGKMFS I SCLENLVWS 
S VEDNCRRKS CGPP PE PFNGMVH INTDTQFGSTVNYS CNEGFRL 
IGS PSTTCLVSGNNVTWDKKAP I CE I I SCE P P PTI SNGD FYSNN 
PTC rWMflTWTYnrHTC PTVWOT.FFT iVf3FR <? T YCTS KDDOVGVW 
SS P PPRCI STNKCTAPE VENAI RVPGNRS FFSLTE I IRFRCQPG 
FVMVGSHTVQCQTNGRWGPKLPHCSRVCQP PPE I LHGEHTLSHQ 
DNFS PGQEVFYSCEPS YDLRGAASLHCTPQGDWS PEAPRCTVKS 
CDDFLGQL PHGRVLLPLNLQLGAKVS FVCDEG FRLKGRS ASHCV 
LAGM KALWNS S VPVCEQ I FCPNPPAI LNGRHTGTPLGD I PYGKE 
VSYTCDPHPDRGMTFNLIGESTIRRTSEPHGNGVWSSPAPRCEL 
PVGAACPHP PK IQNGHY IGGHVS LYLPGMT I S YTCDPG YLIjVGK 
GF I FCTDQG I WSQLDHYCKE VNCS FP L FMNG I SKE LEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFILLIIFLSWIILKKRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YLYVRMRANPLAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 
DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 
HKTEGDI FAI VSKAEEFDQI KVREEE I EELDTLLSNFCBLSTPG 
GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 
LFE IALRKRWPTMTYRLLNIiSKAIDKRLWGWAS PLRQFS ILPPH 
MLTRLEE KKLTVDKLKDMRKDE IGHI LHHVN I GLKVKQCVHQ I P 
S VMMEAF I QP I TRTVLRVTLS I YADFTWNDQVKGTVGE P WW I WV 
EDPTNDHI YHSEYFLALKKQVI SKEAQLLVFTI PI FEPLPSQYY 
IRAVSDRWLGAEAVCI INFQHLILPERHPPHTELLDLQPLP ITA 
LGCKAYEALYNFSHFNPVQTQ I FHTLYHTDCNVLLGAPTGSG KT 
VAAE LAI FR VFN KY PTS KAVY I APLKAL VRERMDD W KVR I E E KL 
GKKVI ELTGDVTPDMKS I AKADLIVTTPEKWDGVSRSWQNRNYV 
QQVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 
LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 
HYCPRMASMNKPAFQAIRSHSPAKPVLIFVSSRRQTRLTALELI 
AFLATEEDPKQWLNMDEREMEN 1 1 ATVRDSNLKLTLAFG I GMHH 
AGLHERDRKTVEELFVNCKVQVLIATSTLAWGVNFPAHLVI IKG 
TE YYDGKTRRYVDFP I TDVLQMMGRAGRPQFDDQGKAVILVHDI 
v vnTrvv v T?r .vrdpovt? q ct .t .<z nuT.w&p t arwr t t q vnn at .r> 

KJVUr X i^JVf i-i I atrC r V CiOoJuiA? YJjoUriiJii/VD irtUu ill O Ur\±JU 

Y I TWTY F FRRL IMN P S YYNLGD VS HD S VNKFLSH LIEKSLIELE 
LS YCIE IGEDNRS I EPLTYGRI AS YYYLKHQTVKMFKDRLKPEC 
STEELLS ILSDAEEYTDLPVRHNEDHMNSEIiAKCLP IESKPHSF 
DSPHTKAHLLLQAHliSRAMLPCPDYDTDTKTVIJXJALRVCQAML 
DVAANO^WLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 
LTADKRDDNKW I KLHADQEYVLQ VS LQR VH FGFHKG KPE S CAVT 
PRFPKS KDEGWFLI LGEVDKREL IALKRVG YI RNHHVAS LS F YT 
PEIPGRYIYTLYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 
GL 


5943 


1 


2274 


DKP TRH KT YLS S S W AKMAAAEG P VGDGE LWQT WLPNHWFLRLR 
EGLKNQS PTEAEKPAS S SLPSS P P PQLLTRNWFGLGGELFLWD 
GEDS S FLWRLRGP SGGG \EE PALSQYQRLLCINPPLFE I YQVL 
LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHWLLTSDNVIRIYSLR 
E PQTPTNVI I LSEAEE ES LVLNKGRAYTAS LGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLYI LYENGETFLTY I S LLHS PGN/l 
WKAVGS IAHAS \ AAEDNYGYDACAVLCLPCVPNI LV I ATE SGML 
YHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVBLELALKL 
ASGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKLHKFLGSDEEDKDSLQELSTBQKCFVEKILCTKPLPCRQPAP 
IRGFWIVPDII/3PTMICITSTYECLIWPLLSTVHPASPPLLCTR 
EDVEVAESPLRVLAETPDSFEKHIRSILiQRSVANPAFLKASEKD 
IAPPPEECLQLLSRATQVFREQYILKQDLAKEEIQRRVKLLCDQ 
KKKQLEDLSYCREERKSLREMAERLADKYEEAKEKQEDIMNRMK 
KLLHS FHSEL P VLSDSERDMKKELQL r PDQLRHLGNAI KQVTMK 

KDYQQQKMEKVLSLPKPTIILSAYQRKCIQSILKEEGEHIREMV 
KQINDIRNHVNF 


5944 


167 


3428


FSIATFTDEPEVLTEPPSATTTTTIGISATVn'TLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
FCKQPSVLVTPPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPSPLSSPNGKL7VASPKRGQKREEGWKEWRRSKKVSVPSTVI 
S R V I GRGGCN I NAI REFTG AH I D I D KQKDKTGDR 1 1 T I RGGTE S 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTS LMG I KMTTVALS S TS QTATALTVPAI S S AS THKT I KNP 
VN\NVRPGFPVSFP\LAYPPPQFAHALLAAQTFQQIRPPRLPMT 
HFGGTF P PAQS TWGPFPVR PLS PARATNS PFCPHMVPRHSNQNS S 
GSQWSAGSLTSSPTTTTSSSASTVPGTSTNGSPSSPSVRRQLF 
VTVVKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYP VS S P 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPVVET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLiHQSDTSKAPG 
FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TS APS VIGSNLSTS VGHSG IWS FEGIGGNQDKVDWCNPGMGNPM 
I HRPMS DPG VFSQHQAMERDS TG I VTP SGTFHQHVPAG YMDFP K 
VGGMPFS VYGNAM I P P VAPI PDGAGGP I FNG PHAADPS WNSL I K 
MVSSSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNG IAEDLKGQADFFFLLVSE AWATGS PRA 
WLTCLILPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKFCRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESH 
0 E TRS P ENRS PTPS LQ YC ENCDTHFQDSNHRTS TAHLLS LSQGP 
Q P PNLPLG VP I SS PG FKLLLRGGWE PGMGLG PRGEGRANP I PTV 
LKRDQEGLGYRS APQ PRVTH FP AWDTRA VAGRE \ TP PRVATLS W 
REERRREE \ KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ILGSYSSIQPEEYS\SWC\EWLQDLLA\YVSPK\HSYLRDLP 
SEGS PQRVNS IDFV\ EL \ EHLQPD VLVHAVLRWDF / TI LTEAV 
YSYRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW\YPQLQRKKG 
YIWEFKYLFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPS FVKISDLATHLEDKCSGWLIKAQI SELAFPITASQ 
KIALNAHS S LKS I FSSLPNI VYTGCAKCX3LELETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVI VPS S E I TYGMWADLFHSLLAVSAE PCVLKI QSLFVL 
D ENS Y PLQQD FS L LDFY P D I VKHGANARL 


5947 


3 


1317 


RG I PDRRRRGP IGRVNMDLENKVKKMGLGKEQGFGAPCLKCKEK 
CEG FELHF WRK I CRNC\NVAKKSM /T VLLSNEEDRKVGKLFEDT 
KYTTL 1 AKLKSDGI PMYKRNVMI LTNP VAAKKNVS INTVTYEWA 
P PVQNQALARQ YMQML P KE KQ P VAGS EGAQ YRKKQLAKQLPAHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
DPAIYAERAGYDKLWHPACFVCSTCHELLVDMIYFWKNEKLYCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKP VCKPCYVKNHA WCQGCHNAI DPEVQRVTYN 
NFSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS 


5948 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP
PVCLRTKRHKNNRVKKKNEALPSAKGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP
E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV
FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV
GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ
WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL
SRVHGEPTSDLSDID


5949 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA
VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP
E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR
KKAKKAKKAIAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV
FYRAAVRLRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV
GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ
WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL
SRVHGEPTSDLSDID


5950 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR
CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ
DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR
LTRVLLTASTLKSIPTSLLGDLFFRPIIGDVDIAGLLGDMLLLR


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDSISGKTGETGVEEMIATRK 
VEQDS KSTVKLSHEDDH I LE DAG S S D I SS DAACTNPNKT ENS LV 
GLPSCVDEVTECNLELKDTMGIADKTENTLERNKI EPLGYCEDA 
ESNRQLESTEFNKSNLE WDTST FG PESN ILENAI CDVPDQNSK 
QLNAI ESTKI ESHETANLQDDRNSQS SSVSYLESKSVKSKHTKP 
VIHSKQNMTTDAPKXIVAAKYEVIHSKTKVNVKSVKRNTDVPES 
QQNFHRPVKVRKKQ IDKE P KI QSCNSG VKS VKNQAHS VLKKTLQ 
DQTLVQ I FKPLTHSLS DKSHAHPG<!XKE PHkPAQTGHVSHS SQK 
QCHKPQQQAPAMKTNSHVKEELEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSS KS FS LDE P PLF I PDNI AT IRREGSDHSS S FES KYM 
WTPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDCVGLSLSQAQOM 
GEEDKEYVCVKCCAEEDKKTE I LDPDTLENQATVE FHSGDKTME 
CEKLGLSKHTTNDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 
NE I KKWQLAPLRKMGQ PVLPRRS SEEKSEKI PKESTTVTCTGEK 
ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQIRQSVR 
H S LKD I LM KRLTDSNLKVPE EKAAKVATK I E KEL FS F FR DTDAK 
YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS 
KELAAWRRRENRHTI EMI EKEQREVERRPITKITHKGEI EIESD 
APMKEQEAAME I QEPAANKSLEKPEGSEK\RKEEVDSMS KBITS 
QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKWVGVARKHSDNE 
AES I ADALS S TSNILAS E FFEEE KQ ES PKST FS PAPR PEMPGTV 
EVESTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 
PDS I QVGGR I S PQTWDYVEKIKASGTKE I CWRFTPVTEEDQ I 
S YTLLFAYFSSRKRYGVAANNMKQVKDMYLI PLGATDKI PHPLV 
P FDG PGLELHRPNLLLGL I IRQKLKRQHS ACASTSH I AETPES A 
PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 
NLQEDLPTAVEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLEL 
ANKP L P VDD I LQS LLGTTGQVYDQ \ AQS VME QNTVKE I PFLNEQ 
TNS K I S KTDNVE VTDGEN KE I KVKVDN I S E S TD KSAE I E TSWG 
SSS ISAGSLTSLSLRGKPPDVSTEAFLTNLS IQSKQEETVESKE 
KTLKRQbQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 
LVANTARS PQF INLKRD PRQAAGRSQPVTTSES KDGDS CRNGEK 
HMLPGLSHNKEHLTEQ INVEEKLCS AEKNS CVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHL.KS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 
VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 
RRKSDPWGRQDQOQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYH 
KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRS ARDLPRALSMEAAR PSGS WNGALCRLL \ LVTL \ AFL I F 
ASDACKWTLHVPSKLDAEKLVGRVNLKECFTAANLIKSSDPDF 
QILEDGSVYTTNTILLSSEKRSFTILLSNTENQEKKKIFVFLEH 
QTKVLKKRHTKE KVLRRAKRRWAP I PCSMLENSLG PFPLFLQQV 
QS DTAQNYT I YYS I RGPGVDQEPRNLFYVERDTGNLYCTRP VDR 
EQYES FE I IAFATTPDGYTPELPLPL 1 1 KIEDENDNYP I FTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGL 
QTTSTC 1 1 N I DD VNDHL P T FTRTS YVTS VEENTVDVE ILRVTVE 
DKDLVNTANVmANYTILKGNENGNFKIVTDAKTNEGVL(^KPL 
NYEEKQQMILQIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVT I DENTGS I KVFRS LDREAET I KNG I YN I TVLAS DQG 
GRTCTGTLGI ILQDVNDNSPFI PKKTVI ICKPTMSSAEIVAVDP 
DEPIHGPPFDFSLESSTSBVQRMWRLKAINDTAARLSYQNDPPF 
GS YWP irVRCRLGMSS VTSLDVTLCDCITENDCTHRVDPR IGG 
GG VQLGKWAI LAILLG IALFFC I LFTLVCGASGTS KQ PKVI PDD 
LAO^NLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
I KNGGQETI EMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDNC 
RYTYSEWHS FTQPRLGEES I RGHTLIKN 


5953 


330 


811


P L LCN PD PG W YWWVKQES E I S KESQEMDARPKLDLGFKBGQT I K 
LCIGN ITNKKGGAS KPRTARGGGLSLLP P PPGGKVT I P PPS S / V 
KLPSTNHVTPPS I PKSNHGGSDADILLDLDSPAP VTTPAPTPVS 
VSNDLWGDFSTAS SS VPNOAPQPSNWVQF 


5954 


32 


2130 


P P PPP PK1ANMADLEAVLADVS YLMAME KS KATP AARAS KR I VL 
PSPSIRSVMQKYLAERNEITFDKIFNQKIGFLLFKDFCLNEINE 
AVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 
PFSKQAVEHVQSHLSKJCQVTSTLFQPYIEE ICESLRGDI FQKFM 
ESDKFTRFCQWKNVELNIHLTKNEFSVHRIIGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKRI KMKQGETLALNER IMLSLVSTGDCPFI 
VCMTYAFHTPDKLC F I LDLMNGGDLH YHLS QHGVFS E KEMRFYA 
TE I ILGLEHMHNRFWYRDLKPAN I LLDEHGHARIS \DLQLACD 
FSKKKPHASVGTHGYMAPEVbQKGTAyDSSADWFSLGCMLFKLL 
RGHS P FRQKKT KDKHE I DRMTLTVNVELPDTFS PELKSLLEGLL 
QRDVS KRLGCHGGGSQE VKEHS FF KGVDWQHVYLQKYP P PLIPP 
RGEVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERW 
QQEVTETVYEAVNADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HG YMLKLGNP F LTQWQRR Y FYL F PNRL EWRG EGES RQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAP KFLNKPRSGTVELP KPS LCHRNSNGL 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQBFADCRR 
PANRQDVLSGW INLPVLQLT1CDPLKTPGRLDHGTRTAF I HHREQ 
VW KR C I NI WRD VG LFG VLN E I AN S E EE VFE WVICTASGWALALCR 
WAS S LHGSLFPHLS LRSEDLI AE FAQVTNW S S CCLRVFAWHPHT 
NKFAVALLDDS VRVYNAS ST I VPS LKHRLQRNVASLAWKP LS AS 
VLAVACQSC I L I WTLDPTS LS TRP S SGCAQVLSHPGHTPVTS LA 
WAPS GGRLLSAS PVDAAI R VWDVS TETCVPL PWFRGGG VTNLLW 
SPDGS KI LATT PS AVFRVWEAQMWTCE RWPTLSGRCQTGCWS PD 
GSRLLFTVLGE PL I YSLS FPERCGEGKG\ ALEVQSQQRLWQ I CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1 /UD 


J- J J 


GVG VRGARAMATVQEKAAALNLSALHS P AHR P PGFS VAQ KP FGA 
TYVWSSI INTLQTQVEVKKRRHRLKRHKDC FVGS EAVDVI FSHL 
IQNKYFGDVDI PRAKWRVCQALMDYKVFEAVPTKVFGKDKJCPT 
FEDSSCSLYRFTT I PNQDSQLGKENKLYSPARYADALFKSSDIR 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRIi 
LQLVDLPLLDSLLKQQEAVPKI PQ PKRQSTMVNSSNYLDRG ILK 
AYSDSQEDEWLSAAIDCSEYLPDQMWEISRSFPEQPDRTDLVK 
E LLFDAI GRYY S S RE P LLNHLS DVHNG I A ELL VNG KTE IALEAT 
QLLLKLLDFQNREEFRRLLYFMAVAANPS E FKLQKES DNRM WK 
RI FS KAI VDNKNLS KGKTDLLVLFL\MDHQKDVFKI PGTL \HKI 
VSXVKXLMAIQNGRDPNRDAGYIYCQRIDQRDYSNNTEKTTKDE 
LLNLLKTLDEDS KLSAKEKKK\LLGQFYKCHPD I F I EHFGD 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENI KNAML I K 
GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
SKKSDCSLFMFGSHNKFO^PNNLVIGRMYDYHVLDMIELGIENFV 
SLKDIKNSKCPEGTKPMLIFAGDDFDVTEDYRRLKSLLIDFFRG 
PTVSNIRLAGLEYVLHFTALNGKIYFRS YKLLLKKSGCRTPRIE 
LSE^PSLDLVLRRTHLASDDLYKLSMKMPKALKPKKKKNISHD 
TFGTTYGRIHMQKQDLSKLQTRKM\KGLKKRPAERITEDHEKKS 
KRI KKKLMELSQPLLFHCVLLKR 1 1 KHQS I QSFL 


5958 


1 


3138 


AAALGMLLWFPACQAFNLDVEKLTVYSGPKGSYFGYAVDFHIPD 
ARTASVLVGAPKANTSQPDIVEGGAVYYCPWPAEGSAQCRQIPF 
DTTNNRK I RVNGTKE P I EFKSNQWPG\ ATVKA\HKGKS CGPVAP 
LLFTWRN FLKPT PE KGP VGTCYVAIQNFS AYAEFS PCGNSNADP 
EGQGYCQAG FS LDFY KNGDLI VGG PGS F YWQGQVITASVADI I A 
N YS FKD I LRKLAGE KQTEVAPAS YBDS YLGYS VAAGE FTGDSQQ 
ELVAGI PRGAQNFGYVS I INS YDMTFIQNFTGEQMAS YFGYTW 
VS DVNS DG LDDVL VGAPLFME RE F ESNP RE VGQI YL YLQ VS S LL 
FRDPO I LTGTETFGRFGSAMAHLGDLNQDGYND IAI GVP FAGKD 
QRGKVL I YNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDI DKNDYPDLI VGAFGTGKVAVYRARP WTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS IANTI VLMAEVQLD 
S LKQKGAI KRTLFLDNHQAHRVFPLVI KRQKSHQCQD F I VYLRD 
ETEFRJDKLSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
QAHILVDCGEDNLCVPDLKLSARPDKHOVI IGDENHLMLIINAR 
NEGEGAYEAELFVMI PEEADYVGI ERNNKGFRPLSCEYKMENVT 
RMWCDLGN PMVSGTNYSLGLRFAV PRLEKTNMS INFDLQ IRSS 
NKDNPDSNFVSLQIN3TAVAQVEIRGVSHPPQIVLPIHNWEPEE 
EPHKEEBVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
LYIFHIQTLGPLQCQPNPNINPQDIKPAASPEDTPELSAFLRNS 
T I PHLVRKRD VHWEFHRQS PAK I LNCTN I ECLQ ISCAVGRLEG 
GES AVLKVR S RLWAHT FLQRKND P YALAS LVS FEVKKMP YTDQ P 
AKLPEGS IAIKTSVIWATPNVSFS IPLWVI ILAILLGLLVLAIL 
TLALW KCG F FDRAR P PQEDMTDREQLTND KTPE A 


5959 


1 


1166 


GTSGYAAQQLPSLLKEREFHLGTLNKVFASQWLNHRQWCGTKC 
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDNPNS LAI YRLPTLDPVCVGDDGHKDWIFS I AW ISDTM 
AVSGS RDGS MG L WE VTDDVLTKS DARHNVS RVPVYAH I THKALK 
D I PKEDTN PDNCKVRALAFNNKNKELGAVSLDG YFHLWKAENTL 
S KLLS TKLP YCRENVCLAYGSEWS VYAVGSQAHVS FLDPRQPS Y 
NVKS VCSRERGSG IRS VS F YEH 1 1 TVGTGQGS LLFYD I RAQRFL 
EERLSACYGSKPRLAGENLKLTTG\KGWIiNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


FVW S DGG P R PRRG PAVG AGAAH LSD P W AMT PGT ANRATNP LNKE 
LDWASINGFCEQLNEDFEGPPLATRLLAHKIQSPQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELI KWS PKYLGSRTSEKVK 
NK I LELLYS WTVGLPEE VKI AEAYQMLKKQG\ I VKSDPKLPDDT 
TFPLPPPRPKmriFEDEEKSKMI^LLKSSHPEDLRAANKLIKE 
MVQ EDQ KRME KI S KRVNA I EEVNNNVKLLTEMVMSHS QGGAAAG 
S S EDL \MKEL\ YQRCBRMRPTLF PTGRVDTEDND\EALAEI LQA 
NDNLTQ V INLY KQLVRGE E VNGDATAG S I PGSTSALLDLSGLDL 
P P AGTT YP AMP TR PGEQ AS PEQ P S AS VS LLDD E LMSLGLSD PT P 
PSGPSLDGTGWNS FQS S DATEP PAPALAQAPSMESRP PAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAPQPIRNIVFQSAVPKYMKVKLQPPSGTELPAFNPIVHPSA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKI EDFKVGNLLGKGSFAGVYRAES IHT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQI IT 
GMLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGLATQLKMPHE 
KHYTLCX3TPNYI S PEIATRS AHGLESDVWS LGCMFYTLL IGRP P 
FDTD TVKNTLNKWLAD YEM PTFLS IEAXDLIHQLLRRNPADRL 
SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEBRYSPTDNNANIF 
NFFKEKTS SSSGS FERPDNNQALSNHLCPGKTP FPFADPTPQTE 
TVQQWFGNIjQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDASDNAHS VKQQNTMKYMTALHS KPE I IQQECVF 
GSDPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHRLKPIRQKT 
KKAWSILDSEEVCVELVKEYASQEYVKEVLQISSDGNTITIYY 
PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
SRFVQLWSKSPKITYFTRYAKCILMENSPGADFEVWFYDGVKI 
HICTEDFIQVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
ICLALESIISEEERKTRSAPFFPIIIGRKPGSTSSPKALSPPPS 
VDSNYPTRDRAS FNRMVMHSAASPTQAP ILNPSMVTNEGLGLTT 
TASGTD I S SNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTSGAVW 
VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENEKLPDYIKQ 
KLQCLSS ILLMFSNPTPNFH 


5962 


20 


2447 


RVCSS S ASTASQAVMADAWEE IRRLAADFQRAQ FAEATQRLS ER 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQLIDEN 
YLDRLAEEVNDKLQESGQVTI SELCKTYDLPGNFLTQALTQRLG 
R I ISGH I DLDNRGVI FTEAF VARHKAR I RGLFSAI TRPTAVNS L 
I S KYG FQEQLL YS VLEELVNS GRLRGT WGGRQD KAVFVPD I YS 
RTQ STWVDS FFRQNG YLEFDALS RLG I PDAVSY I KKR YKTTQLL 
FLKAACVGQGLVDQVEASVEEAI SSGTWVDI APLLPTSLSVEDA 

AILIiOOVMRAFSKOASTVVF9nTVVV<iFKF\ TNnrTFT.FRFT JVTU 

QKAEKEMKNNPVHLITEEDLKQISTLESVSTSKKDKKDERRRKA 
TEGSGSMRGGGGGNAREYKIKKVKKKGRKDDDSDDESQSSKTGK 
KKPEISFMFQDEIEDPIiRKHIQDAPEEFISELAEYLIKPLNKTY 
LEWRS VFMS STTSASGTGRKRT I KDLQEE VSNLYNN I RLFEKG 
MKFFADDTQAALTKHLLKSVCTDITNLIFNFLASDLMMAVDDPA 
AI TS E IRKKI LS KLS EETKVALTKLHNSLNEKS IEDFIS CLDSA 
AEACD IMVKRGDKKRERQ I LFQHRQALAEQLKVTEDPAL I LHLT 
SVLLFQFSTHSMLHAPGRCVPQI IAFLNSKI PEDQHALLVKYQG 
TiV\nfnTiV^n^TTKTf2rw^nvDT.MMTrT.nvTrnwn\r&c , PTDV''c , T act go 

Jj V V zvyJJ v oyo rvJV JL VjyuiJ I Jr JoiVlN CiLiLJl^tLi\Jt!iLJ V Aj 1 1 KJxJL. Lt^tL Lioo 

SIKDLVLKSRKSSVTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GS PGQ PGTPGS KGSKGE PG I QGMPGAS GLKGEPGATGS PGE PG Y 
MGLPG I QGKKGDKGNQGEKG I QGQKGENGRQG I PGQQG I QGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GP PGLDGKPGRE FSEQF I RQ VCTD VI RAQLP VLLQSGR I RNCDH 
CLSQHGSPGIPGPPGPIGPEGPRGLPGLPGRDGVPGLVGVPGRP 
G VRGLKG LPGRNGE KGS QG FG YPGE QGP PGP PGPEGPP G I S KEG 
PPGDPGLPGKDGDHGKPGIQGQPGPPGICDPSLCFSVIARRDPF 
RKGPNY 


5964 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGE VTFENVKE I FGQTI IHHH I P FNWDCE FIRLHFGHNR 
KKHLNYTEFTQFLQELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVT I RSHMLTP FVEENLVS AAGGS I SHQVS FS YFNAFNS LLNNM 

FT ATV)V T VQTT Af2TD VT^ATrirTVTrPirancii TDV^lOXTTJT DTFiTT vr\ 
tuvKFJ I o 1 ij/Wjr 1 KiMJ/\C< V 1 FJitttC Ay^Al K I I rLitli IL/LLi 1 \J 

LADL YNAS GRLTLAD I ER I AP LAEGALP YNLAE LQRQQS PGLGR 
PIWLQ I AES AYRFTLGSVAGAVGATAVYP I DLVKTRMQNQRGSG 
S WGELM YKNSFDCFKKVLR YEGFFGL YRGL I PQLIGVAPE KA I 
KLTVKDFVRDKFTRRDGS VPLP AEVLAGGCAGGS Q VI FTNPLE I 
VKIRLQ VAGE ITTGPR VSALNVLRCLG I FGLYKGAKACFLRD I P 
FSAIYFPVYAHCKLLLADENGHVGGLNLLAAGAMAG\VPAASLV 
TPADV I KTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAAR VFRS S PQFG \ VTL VT YELLQRG FY I D FGGL KPAGSE PTP K 
SRIADLPPANPDHIGGYRLATATFAGIENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVTWLYRFLPTSNMAAKIiRSLLPPDLRLQFWLHARLQKCFLSRG 
CGS YCAG AKAS P I . PG KM AMGT.MCGPJ? EL.T.P'LT /"i o. r!P P VHQVfin D 
S QWLGKPLTTRLLFPAA PCCCR PHYL FLAAS GPRS LS TS AI S FA 
EVQVQAP PWAATPS PTAVP EVAS GETADWQTAAEQS FAELGL 
G S YTPVG L I QNLLE FMHVDLGL P W WGAI AACTVFAR CL I FPL I V 
TGQREAARIHNHLPEIQKFSSRIREAKIiAGDHIEYYKASSEMAL 
YQKKHG I KLY KPL I LP VTQAP I F I S FF I ALR EMANL P VPS LQTG 
GLWWFQDLTVSDPIYILPLAVTATMWAVLELGAETGVQSSDLQW 
MRNVIRMMPL I TLP I TMHFPTAVFMYWLS SNL FSLVQVSCLR I P 
AVRTVLK I P Q R WHDLDKL P P REG FLES F KKG W KNAEMTRQLRE 
REQRMRKQLELAARGPLRQTFTHNPLLQPGXDNPPNIPSS\SSS 
SSKPKSKYPWHDTLG 


5966 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT
CEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV
SVSTQTKKLSASSPRMLHRSTQTTNTCVCQSMCHDKYTKIFNDF
KDRMKSDHKRETERVVREALEKLRSEMEEEKRQAVNKAVANMG^
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5967 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAXVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF
KDRNKSDHKRETERVVREALEKLRSEMEEEKRQAVNKAVANMCX5
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5968 


81 


1288 


VRFPRRGGAP P TVLTPGRQQG VFLGPQRPGSE PD I PARGQPHPP 
RPVGV STS AQ AQ VQ P PAMHRRRIiALGLGFCLLAGTSLS VLW VYL 
ENWLPVS YVP YYLPCPE I FNMKLHYKREKPLQPWWSQ YPQPKL 
LEHR PTQLLTLTP WLAPI VS EGT FNPELLQHI YQPLNLT IGVTV 
FAVGN / HFLES AEE FFMRGYRVHYYI FTDNPAAVPGVPLGPHEL 
LSSIPI QGHSKWEETSMRRMET ISQH I AKRAHREVD YL FCLDVD 
MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA 
DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANGIMAAWR 
EESHLNRHFISNKPSKVLSPEYLWDDRKPQPPSLKLIRFSTLDK 
DISCLRS 


5969 


1126 


503 


DVGFNIKRKRCDLDVFLESPRKPSGRRDRAPEKQRRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIVVLFGISITGGLFYTI
FKELFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR
GRRQHVRFTEYVKDGLKHTCVK7YIEGSEPGKQGTVYAQVKENP
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD 


5970 


316 


4712 


SQDNIGHRLLQKHGWKLGQGLGKSLQGRTDPI P I WKYDVMGMG 
RMEMELDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKQRLKDLK 
QRE FARNVS S RSRKDE KKQEKALRRLHE LAEQRKQAE CAPGSGP 
M FKPTTVAVDEEGGEDDKDES ATNSGTGATAS CGLGS EFSTDKG 
GP FTAVQI TNTTGLAQAPGLAS QG I S FG I KNNLGTPLQKLGVS F 
SFAKXAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 
KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLiATPA 
GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 
KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 
DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 
SKHRSKRHKYSSSDDDYSIiSCSQSRSRSRSHTRERSRSRGRSRS 
S SCSRS RS KRRS RS TTAHS WQRSRS YSRDRS RSTRS PSQRSGS R 
KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
DDGRGDDS KATG P P SQNSN IGTGRGS EGDCS PEDKNS VTAKLLL 
E KIQSRKVERKPS VS EEVQATPNKAGP KLKD PPQG YFGPKLP PS 
LGNKPVLPLIGKLPATRKPNKKCEESGLERGEEQEQSETEEGPP 
G S S DAL FGHQF P \ S E ETTG PLLD PP PEES KSGE VT ADHPVAP LG 
PPAHFDCYLGDPTISHNYLPDPSDGNTLESLDSSSQPGPVESSL 
LP IAPDLEHFPS YAP P SGDPS I ESTDGAEDA\SLAPLESQPI TF 
TPEEMEXYSKLQQAAQQHI QQQLLAJCQ VKAFPASAALAPATPAL 
QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 
PLAQVHH I PQPHLTP I SLSHLTHS 1 1 PGHPATFLASHP IHI IPA 

SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 
FSGQDLQHPPSHGT 


5971 


53 


2149 


S FLY FVGVDMDNP I GNWDGRFDG VQLCS FACVEST I LIiHINDI I 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RS E LFYTLNGSS VDS Q PQS KS KNTW Y I DE VAEDPAKS LTE I S TD 
FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP 
P F YGVI RW I GQPPGLNEVLAGLELEDE CAG \ CTDGTF/REGTR Y 
FT CALKKAL F VKL KS CRPDS R FAS LQ P VSNQ I ERCNS LAI WEAY 
LSEVVEENTPTQKWEKEGLEIMIGNKKKGIQGHYNSCYIoDSTLF 
CLFAFS S VLDT VLLR P KEKNDVE Y YS E TOE LLRTE I VN P LR I YG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHIIiRV 
EPLLKIRSAGQKVQDCYFYQIFMEKNEFCVGVPTIQQLLEWSFIN 
SNLKFAEAPS CLI IQMPRFGKDFKLFKKI FPSLELNITDLLEDT 
PRQCRICGGLAKyECRECYDDPDISAGKIKQFCKTCNTQVHLHP 
KRLNHKYNPVSLPKDLPDWDWRHGCIPCQNMELFAVLCIETSHY 
VAFVKYGKDDSAWLPFDSMADRDGGQNGFNIPQVTPCPEVGEYI. 
KMS LEDLHS LDS RR I QGCARRLLCDAI YVPCTQS PTMS L YK 


5972 


440 


1761 


ILLAGSPSPRDQCSQRQSSGGDKELVTRGCTFSTAWSPSAMTQ

E P FREE LA YDRM PTLERGRQD PAS YAPDAKPS D LQLS KRL P P C F 

S HKT W VFS VLMG S CLL VTS G FS L YLGNVF P AEMD YLRCAAG S C I 

PSAIVSFTVSRRNANVI PNFQILFVSTFAVTTTCLIWFGCKLVL 

NPSAININFNLILLLLLELLMAATVIIAARSSEEDCKKKKGSMS 

DSANILDEVPFPARVLKSYSWEVIAGISAVLGGIIALNVDDSV 

SGPHLS VTFFWI L VACFPSAIASHVAAECPNKCL VBVL IAIS SL 

TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLLLV 

LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQEPPEGVRQGESLESRRGANGPVTPRRGNRVAAPSLAPGMETH 

NP 


5973 


65 


2007


NGDGKDLiFGHIWAWRSNGIISNFRRSPHAGMAEDEPDAKSPKTG 
GRAP PGGAE AGEPTTLLQRLRGT I S KAVQNKVEG I LQD VQKFSD 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQS VYDAYRKYCESLACCRPLSTANFGKI IRE I FPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERLAQP p KDLEARTGAG P LARGE R KKS WES S APGANNLQ V 
NAIjVARLPLLLPRAPRSLI PPI PVSPPI IiAPRLSSGALKVATLP 
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTENREVG I GGDQGPHDKG VKRTAEVP VS EASGQAP PAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIi 
P WETWGS GG EGNS AGGAER PG PMGEAE KGAVLAQG \QGDGTVS K 
GGRGPGSQHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQS S LSQEHKDP KATP P 


5974 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR
TV\ASIKNBPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNXEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLKKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGXNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM






WO 01/53312 



PCT/US00/34263



EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNXIVKNRR
TV\ASIKNX>PPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDVVMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQVVGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL 


5976 


20 


2949 


VHHLHLTRVS VWNLD IILR IAQQMGIKTLNLVLG \ LKRA\ LE F 
PEVSWMEVKDPNMKGAMLTNTGKYAIPTIDA\EAYAIGKKEKPP 
FLPEEPSSSSEEDDPIPDELLCLICKDIMTDAWIPCCGNSYCD 
ECIRTALLESDEHTCPTCHQNDVSPDALIANKFLRQAVNNFKNE 
TGYTKRLRKQLPSPPPP I PPPRPLIQRNLQPLMRSP ISRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVS I S VHS E KSDG PFRDS DNK I LPAAALASEHS KGTS S I AITA 
LMEEKGYQVPVLGTPSLLGQSLLHGQLIPTTGPVRINTARPGGG 
R PGWEHSN KLG YL VS P P QQ I RRGERS CYRS I NRGRHH S E RS QRT 
QGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSVPPPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APPLS REE F YRE Q RRL KE E E KKKS KLDEFTND FAKELME YKKI Q 
KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RSHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 
R YHSRS RS PQAFRGQS PNKRNVP QGETERE Y FNR YREVP P P YDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYEKYYKGYAAGAQPR 
PSANRENFS P ERFLPLN I RNS P FTRGRREDYVGGQSHRS RN IGS 
NYPEKLSARDGHNQKDNTKSIG2KESEKAPGIX3KGNKHKKHRKRR 
KGE ES EG FLNPE LLETS R KS RE P TG VEENKTDS LF VLP S RD DAT 
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKENIVKPAKGPQEKVDG\DV3y3LLDLNL\QLKKPKEETPKDL 
TILKHHLPLRRMKKSL \ EP P \ EKLTLNQQK\TPRNKTSQRGKS E 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQ VbSHFQ CLS LHS INHILHPGAGVAAG P ATGW / RE YLT 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAIIEEDDGDGGWV 
DTYHNTGITGITEAVKEITLENKDNIRLQDCSALCEEEEDEDEG 
EAADMEEYEESGLLETDEATLDTRKIVEACKAKTDAGGEDAILQ 
TRTYD L Y I TYD K Y YQTPRL W LFGYDEQRQ PLTVEHM YED I S QDH 
VKKTVTIENHPHLPPPPMCSVHPCRHAEVMKKIIETVAEGGGEL 
GVHMYLLI FLKFVQAVI PT I E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAMDEQSVESIABVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELVNCRWAEEVTQQLDTLQ 
L CSLTKHE ENE KDKCENHHE KLS VFCWTCKKC I CHQCALWGGMK 
GGHTFKPLAE I YEQHVTKVNEEVAKLRRRLMELI SLVQEVERNV 
EAVRNAKDERVRE I RNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
ASFVTTPVPPDFTSELVPSYDSATFVLENFSTLRQRADPVYSPP 
LQVSGLCWRLKVYPDGNGVVRGYYLSVFLELSAGLPETSKYEYR 
VEMVHQSCNDPTKNI IREFASDFEVGECWGYNRFFRLDLLANEG 
YLNP^QNDTVI LRFQ VRS PT F FQKSRDQHWYITQLEAAQTS YIQQ 
INNLKERLTI ELSRTQKSRDLS PPDNHLS PQNDDALETRAKKSA 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








CSDMLLER\GPYSAS\VREAKEDEEDEEKIQNEDYHHELSDGDL 
DLDL VYEDE VNOLDGS S S S AS S TAT <3 MT R pwri t npvTM cnrMm r 

EYKNMELEEGEIiMBDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 
ATSSLLD I DPL I LIHLLDLKDRSS I ENLWGLQPRPPAS LLQPTA 
SYSRKDKDQRKQQAMWRVPSDLKMLKRLKTO^IAEVRCMKTDVKN 
TLS E I KSS S AAS GDMQTS L FS ADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
QEEHTS VGGFHDS FMVMTQPPDEDTHSSFPDGEQIGPEDLS FNT 
DENSGR 


5979 


212 


3665 


LPDMTKYLWLKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNASE' 
TTTLS PSGSAVI STTTIATTPSKPTCDEKYANITVDYLYNKETK 
LFTAJCLNWENVECGNNTCTNNEV^HNLTECKNASVSISHNSCTA 
PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQNI TYR FQ CGNM I FDNKE I KL BNLE PEHE YKCD SE I LYNS HK 
FTNASKI I KTDFGSPGEPQI I FCRSEAAHQGVI TWNPPQRSFHN 
FTLCYIKETEKDCLNLDKNLIKYDLQNLKPYTKYVLSLHAYIIA 
KVQRJNfGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIALLW 
LYKI YDLHKKRS CNLDEQQELVERDDEKQLMNVE P IHAD I LLET 
YKRKIADEGRLFLAEFQS I PRVFS KFP I KEARKP FNQNKNR YVD 
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DETVDDFWRMI WEQKATVI VMVTRCEEGNRNKCAE YWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 
FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 

TGTYIGIDAMTiFf5TiFaRKJTVn\7VrZV\nyvT DDnOPT lunrni ?r?* rwr-r 
■■■ 1 * UA it-id\j udndrtr^. >u v luivv 1 1 k k 1 1 ?r t 1 1 Pn vU VhiAui I 

L.IHQALVEYNQFGETEVNLSELHPYLHNMKKRDPPSEPSPLEAE 
FQRLPS YRSWRTQHI GNQE \ENKSKNRNSNVI PYD YNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 
AAQG P LKET I GDFWQM I FQRKVKV I VMLTELKHGDQE I CAQ YWG 
EGKQTY GD I EVDLKDTDKS S TYTLRVFELRHSKRKDSRrVYQ YQ 
YTNWSVEQLPAEPKELISMIQWKQKLPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGIFCALLNLLESAETEEVVDIFQVVKALRKARP 
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 
KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEGPEHSVNGP 
ASPALNQGS 


5980 


3 


2363 


DAWGCJCLRRLRFTYGTQTRVSLALPGQYELVHTLVAHQGNWETI 
PEEDLEVQENNEDAAHDLTELEVTMHHALLQEVDVVVAPCXJGLR 
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQE I RKYFSFP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHLSTFSHQVLQTRLVDAAKALN 
LVHCHCLDIFINQAFDMQRDLQITPKRLEYTRKKENELYESLMN 
IANRKQEEMKDMIVETLNTMKEELtiDDATNMEFKDVIVPENGEP 
VGTRE I KCC I RQ I QEL I I S RLNOAVANKL I S <? VD YT .P V <5 FvnTT 

ERCLQSLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
MLWEQ I KQ 1 1 QR I TW VS PPAI TLEWKRKVAQEAI ES LS AS KLAK 

SrCSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRLARLS LE S RS LQD VLLHRKP KLGQ E LGRGQ YG VVYLCDN 
WGGHFPCALKSWPPDEKHWNDLALEFHYMRSLPKHERLVDLHG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DWEGIRFLHSQGLVHRDrKLKNVLLDKQNRAXITDLGFCKPEA 
MMS GS I VG TP I HKA PEL FTGKYDNSVDVYAFG I LFWYICSGSVK 
LPEAFERCAS KDHLWNNVRRGARPERLP VFDEECWQLMEACWDG 
D P L KR PLLG I VQP MLQG I MNRLCKS \ NSEQ PNRGLDD S T 


5981 


1 


2519 


GRRHSAAMERPWGAADGLSRWPHGLGLLLLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGR^DFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 
TE FGMAI GPENSG KWLTAE VSGGSRGGR I FRS SDFAKNFVQTD 
LPFHPLTQMMYSPQNSDYLLALSTENGLWVSKNFGGKWEEIHKA 
VCLAKWGSDNTI FFTTYANGS CKADLGALELWRTSDLGKS FKT I 
GVKIYSFGIjGGRFLFASVMADKI)TTRRIHVSTDQGDTWSMAQLP 
S VGQEQFYS ILAANDDMV FMHVDEPGDTGFGTIFTS DDRGIVYS 
KS LDRH L Y TTTGGE TD FTNVT S LRGV Y I TS VLS EDNS IQTMITF 
DQGGRWTHLRKPENSECDATAKNKNECSLHIHASYS ISQKLNVP 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EGPHYYT I LDSGG 1 1 VAI EHSSRPINVI KFSTDEGQ CWQTYT FT 
RDP I YFTGLAS E PGARS MNI S I WGFTES FLTSQWVS YT I DFKD I 
LERNCEEKDYTIWLAHSTDPEDYEDGCIliGYKEQFIiRLRKSSVC 
QNGRDYWTKQPSICLCSLEDFLCDFGYYRPENDSKCVEQPELK 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLS PEKQNS KSNS VP 1 1 LAI VGLMLVTWAGVL I VKKYVC 
GGRFLVHLYSVLQQH\AEA\NGVDGVDALDTASHTNKSGYHDDS 
DEDLLE 


5982 


56 


2316 


ATRP PRGS S WCRQFS RTASAAPGRSNMLRI PVRKALVGLS KS P K 
GCVRTTATAASNL I E VFVDGQS VMVEPGTTVLQACE KVGMQI PR 
FCYHERX.SVAGNCRMCLVEIEKAPKWAACAMPVMKGWNILTNS 
E KS KKAREG VME FLLANH PLD C P I CDQGGE CD LQDQ S MM FGNDR 
S RFLEGKRAVEDKNI GPLVKTIMTRCIQCTRC I RFAS E I AG VDD 
LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 
EEWIS DKTRFAYDGLKRQRLTE PMVRNE KGLLTYTS WEDAIiSRV 
AGMLQSFQGKDVAAIAGGLVDAEALVALKDLLNRVDSDTLCTEE 
VFPTAGAGTDLRSNYLLNTTIAG VEEADWLLVGTNPR FEAP L F 
NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDS PKILQDIASG 
S KP FS Q VL KE AKKPMWLG S S ALQRNDG AA I LAAVS S IAQKIRM 
TSGVTGDWKVMNI LHR I ASQVAALDLG YKPG VEAI RKNP PKVLF 
LLGADGGCI TRQDLPKDCFI I YQGHHGDVGAPI ADVILPGAAYT 
EKS ATYVNTEGRAQQTKVAVTPPGLAREDWKI IRALSE I AGMTL 
PYDTL \ DQVRNRLEEVS PNLVRYDDIEG\ ANYFQQANELS KLVN 
QQLLADPLVP PQLTMKDF YMTDS I SRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5983 


248 


1763 


EARGDGGRRRHRASGRRAGRGEP \ AGLKSQGQRAVPKRAVARGG 
RQ \ YS AAIALLE P AGSE I ADDLS I LYSNRAACYLKEGNCSGC IQ 
IX^RALELHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLQLANDSVNRLSRILMELDGPNWREKLSLIPAVPASVPLQAWH 
PAKEM I S KQAGDSS SHRQOG I TDEKTFKALKEEGNQCVNDKNYK 
DALS KYS ECL K I NNKECAI YTNRALC YL KLCQ FE EAKQD CDQ AL 
QLADGNVKAFYRRALAHKGLKNYQKSLIDLNKVILLDPSIIEAK 
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLLAITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 
LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGQKELIEQLFEDL 
SDTPNNHFTLBD I QALKRQYEL 


5984


755 


1193 


SSVCMACTYVSNLGKKQRSVSFLASGLMRVSTGPELRLHHSFVL 
TGDVGRRI CRLLVGLFTKGDTSS KRVHPFS PGPCFLLCDLARVG 
S S PKI NVS P FYQN \ QTSTQRS CTVF VWQRCSLVG P FQVTVFTM Y 
FHHSLRSISRFSSG 


5985 


22 


1408


KXvHKru 1 M-tl. J\J\K..K 1 VKKoK>\KKJJljAUAfciK1^0VfaEiKtiIJoGR 

rrpnpsipsaaagmshiqippgltellqgytvevlrqqppdlve 
favey ftrlrearapas vlpaatprqs lghpppe pg pdrvadak 
g d s es e e d e dl e vp vp srfnrr vs vcaetyn pde ee edtdpr v i 
hpktdeqrcrlqeacki)illfknldqeqlsqvldamferivkad 
eh vi dqgddgdnfyvi ergt yd i lvtkdnqtrs vggydnrgs fg 
e lalm ynt p raat i vatseg s l wg ldr vtfrr 1 1 vknnakkrkm 
fesfiesvpllkslevsermkivdvigekiykr/dgeriitoge 
k\adsfyiiesgevsilirsrtksnkdggnqeveiarchkgqyf 
gelalvtnkpraasayavgdvkclvmdvqaferllgpcmdimkr 
n i sh yeeql vkmfgs s vdlgnlgq 


5986 


1806 


484 


DAWKSTS LTFHWKLWGRHRGRRRGLAHPKNHLS PQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTFKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
L I SNVCS I GDHVAQEL FQGS DLGMAEEAE RPGE K\ AGQHS PIjRE 
EHVTCV0S I LDEFLQT\ YGSL I PLSTDE WEKLED I FQQEFSTP 
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
QNWLNDQVMNM YGDLVMDTVP EK\ VHFFNS FFY \DKLRTKGYDG 
VKRWTKNVD I FNKE LLL I P IHLE VHWS L I S VD VRRRTI TYFDS Q 
RTLNRRCPKH I AKYLQAE AVKKDRLDFHQGWKGYFKMNVARQNN 
DS DCGAF VLQ YCKH LALS Q P FS FTQQDM P KLRRQ I YKELCHCKL 
TV 


5987 


1806 


484 


DAWKS TS LTFHWKLWGRHRGRRRGLAHPKNHLS PQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQS I LDE FLQT \ YGS L I PLSTDEWEKLEDI FQQE FSTP 
SRKGLVLQL I QS YQRM PGNAMVRGFRVA YKRHVLTWDDLGTLYG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ 
RTLNRRCPKHIAKYLQAFAVKKDRLDFHQGWKGYFKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQ I YKELCHCKL 
TV 


5988 


1292 


410 


FKKYFLS FLGLLES SHS RDR I HNLVLMFLLATHNLVWW FTCRFQ 
RLDC I YLNAG I MPNPQLN I KALLFGLFS \AEGLLTQGDKI TADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLED FQHS KGKE P YSS S KYATDLLS VALNRNFNQQGLYSNVAC 
PGTALTN LT YG I L P P F I WTL LMPA I LLLRFFANAFTLTP YNGTE 
ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKLLELEKH I RVTIQKTDNQARLSGSCL 


5989 


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGLLCDCTFVVDGVHFKAHKAVLA 
ACS E Y FKMLFVDQKD WHLD I SNAAGLGQVLE FM YT AKL S LS PE 
NVDDVL\ AVATFLQMQDI I TACHALKS LAE PATS PGGNAEALAT 
EGGDKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEQEEEGAGPAEVKEEGSQLENGEA 
P E ENENE E S AGTD SGQELG SEARGLRS GT YGDRTE S KAYGS V I H 
KCEDCGKEFTHTGNFKRHIRIHTGEKPFSCRECSKAFSDPAACK 
AHEKTKS PLKPYGCEECGKS YRLI SLLNLR KKRHS G EAR YR CED 
CGKLFTTS GNLKRHQLVHSGE KP YQCD YCGRS FSDPTS KMRHLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
E KPCQCVM CG KAFTQAS S L I AHVRQHTGEKP YVCERCG KRFVQ S 
SQLANHI RHHDNIRPHKCSVCS KAFVNVODLSKHI I IHTGEKP Y 
LCDKCGRGFNRVDNLRSHVKTVHQGKAG I KI LEPEEGS EVSWT 
VDD^WTLATEAIAATAVTQLTVVP VGAAVTAD E TE VLKAE I S KA 
VKQVQEEDPNTHILYACDSCGDKFLDANSLAQHVRIHTAQALVM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
S G F VS LS RLG PS LRDKDLE ME ELM LQDETLLGTMQ S YMDASL I S 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVAS FSGQILAGELDNCVSS I PDFP 
MHLACPEEEDKATAAEMAVPAAGDES ISSLSELVRAMHPYCLPN 
LTHLASLEDELQEQPDDLTLiPEGCWLEIVGQAATAGDDLEIPV 
VVRQVSPGPRPVLLDDSLETSSALQLLMPTLESETEAAVPKVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARKGRKKKS KEQ PAACVEGYARRLRS SSRGQSTVOTEVTS 
QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENS SPKN 
LERSAGQSS PAKEGPLDLYP KLADTI QTNP I PTHLSLVDS AQAS 
PMP VDS VEADP TAVGP VLAG P VP VD PGL VDLAS TS SE L VE P L PA 
E P VL INP VLADS AAVDPAWPI SDNL P P VDAVPSGPAPVDLALV 
DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES 
LiD P P KTI I P EVKEWDS LK I ES GT S ATTH E AR PRPLS LS E YRRR 
RQQRQAETEERSPQPPTGKWPSLPETPTGLADIPCLVIPPAPAK 
KTALQRS PET PLEI CLVP VGPS P AS PSPEP P VSKPVASS PTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGP QHAPFW S T VP PP P L P P AS IGRAVPQ P KME S RGTP A 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
HKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRE 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFIS E IG I EASDLSS LLEQ FE KSEAKKE CPPPAP ADS LAV 
GN SGG VD I PQE KRPLDRLQAP ELANVAGLT P P AT P PHQLW KPLA 
AVS LLAKAKS P KSTAQEGTLKPEGVTEAKHPAAVRLQEGVHGPS 
R VHVGSGDHD YC \ VRS RT P PKK\ M PALL I P EVGS RWNVKRHQD I 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS 
SLLS PEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSAS PSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVLQKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
D P AP VKSKFDS LDFDTLL KQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPSIYC\TKFDKQGNVTSFERKKTELYQELGLQAR 
DLRFQHVMSITVRNNRIIMRMEYLKAVITPECLLILDYRNLNLK 
QWLFRELPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKLHILLQNGKSLSELETDI 
KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM
ELLLEN YYR LADDLSNAARELRVL I DDSQS 1 1 F I NLDS HRNVMM 
RLNLQ LTMG T FS LS LFG LMGVAFGMNLES S LE E DHRI FWL I TG I 
M FMGSGL I WRRLLS FLGR / LARS S I AS YGMKDMVHGG I VEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS I LGTGLLWLPGGI KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKMFEATRLLATI VMLLCFIFTLCAALWVIHKKGLAVLFC I LQ 
FLSMTWYSLSYIPYARDAVIKCCSSLLS 


5993


1650


594 


AEGLGS WAVWAGLGWAGRHMEAGGATGALGVGCKLPSAFC F PGS 
SVAMDMFQKVE KIGEGTYGWYKAKNRETGQLVALKKIRLDLEM 
EGVPS TAIRE I SLLKELKHPNI VRLLDWHNBRKLYLVFE FLSQ 
DLKKYMDSTPGS ELPLHL I KS YLFQLLQGVS FCHSHRVIHRDLK 
PQNLLINELGAIKLADFGLARAFGVPLRTYTKEVVTLWYRAPEI 
LLATR FYTTAVD I WS IGC I FAEMVTRKALFPGDS \ EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGS FP KWTRKGLEE I VPNLE P EG 
RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 


5994 


394 


1934 


AGEVQLHVWIRGMRIQPQ/ KAAAI IDLDPDFEPQSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPG I LG AVTGPRKGGSRRNAWGNQS YAEL I S QAIESAPE KRLT 
LAQIYEWMVRTVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIKV 
HNEATGKSS WWMLNPEGGKSGKAPRRRAASMDS S SKLLRGRS KA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRS S S NASS VSTRLS PLRPESEVLAEE IPAS VS S YAGGVP PT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFS P AEGPLSAGEGCFS SSQALEALLTSDTP P PPADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQELLEARF\TGLGVSKGPLNSESSNQSL 
CS VGS LS DKE VETP E KKQNDQRNR KRKAE P YE TS QG KGT PRGH K 
ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMSVMLAKPRL 
DTEQLAQRGAGLCFTFVSAQONS PSSTGSGNTEHS CS SQKQ I S I 
QHRQT \QSDLTIEKIS ALENS KNS DLE KKEGR I D DLLRANCDLR 
RQI \DEQQKMLEKYK\ ERLNRCFDNEPRNFLI EKSKQEKMACRD 
KS MQDRLRLGHFTTVRHGAS FTEQWTDG YAFQNL I KQQERINS Q 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTNGAENETL 
TLAEYHEQEE I FKLRLGHLKKEEAE I QAELERLERVRNLHIREL 
KRIHNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACREYRIHKELDHPRIVKL 
YDYFSLDTDSFCTVLEYCEGNDLDFYLKQHKLMSEKEARSIIMQ 
I VHALKYLNE I KPP 1 1 H YDLKPGN ILLVNGTACGE I KITDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPKISNK 
VDVWS VGVI F YQCLYGRKPFGHNQSQQD I LQENT I LKATEVQFP 
PKPWTPEAKAFIRRCLAYRKEDRIDVQQLACDPYLLPHIRKSV 
STS S P AGAAI AS TS GAS NNSSSN 


5996 


1612 


981 


DQQACLLGLMLTLE FG I LEFDPS W I GS WTQR/ S WVS WRS RPG CE 
LFS I WFGS I VNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPVVSGEPHPAA 
FWAFLWFTGDS C YL \ ANQWQVS KPKDNPLNEGTDAS PGR PS P FS 
FFS I FTWS LTAALAVRRFKDLS FQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS IWFGS I VNEG Y LNS AS EG E E FC I YKRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPS PFS 
FFS I FTWS LTAALAVRRFKDLS FQEE YSTLFP \ ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS I VNEGYLNSASEGEEFCI YNRNPNACSYGVAVGVL 
AFLTCLLYLALDVYFPQ I SS VKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL\ANGWQVS KP KDNPLNBGTDAS PGR PS PFS 
FFS IFTWSLTAALAVRRFKDLSFQEEYSTLPP\ASAQP 


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIVWGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPE E WKYLP FLALPDGAHNYQEDTVFFHLP PRNGNG 
ATVFG I S CYR\Q I EAKALKVRQAD I TRETVQKS VCVLS KLPLYG 
LLQAKLQL I THA Y FEE KDFSQ I S I LKEL YEHMNS S LGGAS LEGS 
QVYLGLSPRDLVLHFRHKGLILFKLILLEKKVLFYISPVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSAS TADVSHTNLGT I RKVMAGNHGEDAAMKTEE PLFQVE DSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFPKDSVPSESLPITVQPQANTGQWLIPGLISGLE 
EDQYGMPLAI FT KG YL C L P YMALQQ HHL LS D VTVRG FVAGATN I 
LFRC^KHL^DAlVEVEEALIQIHDPELRKIiLNPTTADLRFADYL 
VRHVTENRDDVFLDGTGWEGGDEWIRAQFAVYIHALLAATLQLV 
LFRIVNVAKKIGNVMVTT\SRNWQTGK\AVGQSVGGAFS\SAK 
TA\MSSWLSTFTTSTSQSLTEPPDBKP 


6000 


101 


1561 


TE PCRTAEN CT ATMS ENNKNS LES S LRQLKCH FTWNLMEG ENS L 
DDFEDKVFYRTEFQNREFKATMCNLLAYLKHLKGQNEAALECLR 
KAEELIQQEHADQAEIRSLVTWGNYAWVYYHMGRLSDVQIYVDK 
VKHVCE KFSS P YR I ES PELDCEEGWTRLKCX3GNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\KFYRGKDEPDKAIELLKKALEYIP\NNAYLHCQIGCCY 
RAKVFQVMNLRENGMYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCS I LASLHALADQYEDAE YYFQKEFS KELTPVAKQLLHLRYGN 
FQL YQM KCE DKA I HH F I EG VK I NQKSRE KE KMKD KLQKI AKMRL 
S KNG ADSEALHVLAFLQELNE KMQQADEDS ERGLESGS LI PSAS 
SWNGE 


6001 


176 


1038 


AFAHSPSRGHRHTHIHTPRHTPRCTMAESHLQSSLITASQFFEI 
WLHFD ADGSG YLEG KELQNL I QE LQQARKKAGLE LS P EMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGFIETEELKNFLKBLLEKANKTVDDTKLAEYTDLM 
LKLTOSNNIX^KLELTEMARLLPVQENFLLKFQGIKWCGKEFNKA

FELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 

MALSDGGKLYRTDLALILCAGDN 


6002 


977 


81 


lappggglhipprtplshsrpppshhaphpsplplppadlhphs 
smaqrsdlleldcqltrdrvwvs hdenlcrqsglnrdvgs ld p 
edlplykeklevyfspghfahgsdrrmvrledlfqrfprtpmsv 
eikgkneelireq/vlvrrydrneitiwasekssvmkjcckaanp 
emplsftisrgfwvllsyylgllpfipipekfffcflpniinrt 
yfpfscsclnqllawskwlimrkslirhleergvqwfwclne 
es dfeaafs vgatgvi tdyptalrhyldnhgpaarts 


6003 


140 


4098 


giujrafrgmrrlickricdyksfddeesvdgnrpssaasafkvp
apktsgnpansarkpgsagg p kvgagaskeggagavdeddfika 

FTD VPS IQ I YS S RELEE TLNKI R E I L SDDKHD WDQRANALKKI R 

sllvagaaqydcffqhlrlldgalklsakdlrsqvweacitva 
hls tvlgnkfdhgaeai vptlfnlvpnsakvmatsgcaairfi i 
rhthvprliplxtsnctsksvpvrrrsfefldlllqewqthsle 
rjiaavlvetikkgihdadaearvearktymglrnhfpgeaetly 
nslepsyqkslqtylkssgsvaslpqsdrsssssqeslnrpfss 
kwstanpstvagrvsagsskasslpgslqrsrsdidvnaaagak 
ahhaagqsvrsgrlgagalnagsyasledtsdkldgtasedgrv 
rakls aplagmgnakadsrgrsrtkmvsqsq pgsrsgs pgr vlt 

TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 
ARSSRIPRPSVSQGCS REASRES S RDTS P VRS FQ P LAS RHH S RS 
TG AL YAP E VYGAS GPGYGISQSS RLS S S VS AMR VLNTG S DVE EA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGS I PTYMRQT\ ED V\AEVLNRCASSNWSERKEGLLGLQN 
LL KNQRTLS R VE L KRLCE I FTRM FAD PHGKR V FS M FLETL VD F I 
QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 
FINSSETRIAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RS P ANWS S PLTS PTNTSQNTLS PS AFD YDTENMNS ED I YS S LRG 
VTEAIQNFSFRSQEDMNEPIjKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 

IRALALKVLRE ilrhqparfknyaeltvmktleahkdphkewr 

S AE E AAS V \ LATS I \ S PEQC I KVLCP 1 1 QTADY P I NLAAI KMQT 

KVIERVS ketlnlllpe IMPGLIQG ydnsess vrkacvfclvav 

HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6004 


140 


4098 


gklrafrgmrrlickricdyksfddeesvdgnrpssaasafkvp 

A P KTSGNPANS ARKPGS AGG P KVGAGAS KEGGAGAVDE DD F I KA 

FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 

S LL VAG AAQ YDCF FQHLRLLDGAL KL S AKDLRSQ WREAC I TVA 

HLS TVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRFI I 

RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 

RHAAVLVETI KKG IHDADAEARVEARKT YMGLRNHFPGEAETLY 

NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 

AHHAAGQSVRSGRLGAGALNAGSYASLEDTSDKLDGTASEDGRV 

RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 

TTALSTVSSGVQRVLVNSASAQKRSFCIPRSQGCSREASPSRLSV 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

TGALYAPE VYGAS GPGYGISQSSRLSSS VS AM RVLNTG SD VE EA 

VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 

YS SRNGS I PT YMRQT \ EDV\AE VLNR CAS SNWS ER KEGLLGLQN 

LLKNQRTLS R VE LKRL CE I FTRMFAD PHG KR VFSM F L ETLVD F I 

QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 

FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 

FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDS SQTAL\DNKASLLHSMPTHS S PRSRDYNP YNYSDS I S 
PFNKSALKEAMFDDDADQFPDDLSLDKSDLVAELLKELSNHNER 
VEERKIALYELMKLTQEESFSWJDEHFKTILUJjIiETLGDKEPT 
I RALALKVLRE I LRHQ PARFKNYAELTVMKTLEAHKDPHKEWR 
S AEEAAS V\ LATS I \ S PEQC I KVLC P 1 1 QTADYP INLAAI KMQT 
KVIERVSKETIiNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVI GDELKPHLSQLTGS KMKLLNL YI KRAQTGSGGAD PTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALL 
NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 
KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 
G KKKKKKLGP KKE KKS KS KRKEE E E EDDDDDDDS KE P KS S AQLL 
EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 
KMMMVLGAKWREFSTNNPFKGSSGASVAAAAAAAVAVVESMVTA 
TE VAP PPPP VEVP IRKAKTKEGKGPNARRKPKGS PRVPDAKKPK 
PKKVAPLKIKLGGFGSKRKRSSSEDDDLDVESDFDDASINSYSV 
S DGS TS RS S RS RKKLRTTKKKKKG EEEVTAVDGYETDHQD YCE V 
CQQGGEI ILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEG I 
QWEAKEDNS EGEEILE EVGGDLEEEDDHHME FCRVCKDGGEIiLC 
CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 
WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 
YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 
S\RKRKNKDPKFAEMEERFYRYG I KPEW\MMIHRILNHSVDKKG 
HVHYL I KWRDLP YDQAS WES EDVE I QDYDLFKQS YWNHRELMRG 
EEGRPGKKLKKVKIiRKLERPPETPTVDPTVKYERQPEYLDATGG 
TLH P YQMEG LNW LRF S WAQGTDT I LAD EMGLGKTVQTA VFLY S L 
YKEGHSKGPFLVSAPLSTIIN\WEREFEMWAPDMYV\VTYVGDK 
DSRAI I RENEFS \ FEDNAIRGGKKASRMKKEAS VKFHVLLTS YE 
L I TIDMAILGS I DWACL I VDEAHRLKNNQSKFFRVLNGYS LQHK 
LLLTGT PLQNNLE ELFHLLNFLT PER FHNLEG FLEEFAD I AKED 
Q I KKLHDMLG \ PHMLRRLKAD VFKNM P S KTE L I V\ RVELS PM \ Q 
KKY YK\ Y I LHS KFLKALN\ ARGGGNQ VS LLNVVMDLKKCCNH P Y 
LFPVAAMEAPKMPNGMYDGSALIRASGKLLLLQKMLKNLKEGGH 
RVLIFSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFCFLLSTRAGGLGINLATADTVI IYDSDWNPHNDIQ 
AFSRAHRIGQNKXVMIYRFVTRASVEERITQVAKKKMMLTHLW 
RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 
S S V I H YDDKA I E RLLDRNQDETEDTELQGMNE YLS S FKVAQ YW 
REEEMGEEEEVERE I IKQEES VDPD YWEKLLRHHYEQQQEDLAR 
NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 
DFDERS EAPRRPSRKGLRNDKDKPLP PLLARVGGN I EVLGFNAR 
QRKAFLNAI MRYG MP PQDAFTTQWL VRD LRGKS E KE FKAYVS L F 

EFEHVNGRWSMPELAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 
NTPAP VP PAEDG I KI EENS LKEEES I EGEKEVKS TAPETAI ECT 
QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 
AADVEKVEE KSAI DLTP I WEDKEEKKEEEEKKE VMLQNGETPK 

YEIWHRRHD YWLLAGI INHGYARWQDIQNDPRYAI LNEPFKGEM 
NRGNFLE I KNKFLARR FKLLEQALVI EEQLRRAAYLNMSED PSH 
PSMALNTRFAEVECLAESHQHLSKESMAGNKPANAVLHKVLKQL 
EELLSDMKADVTRLPATIARIPPVAVRLQMSERNILSRLANRAP 
EPTPQQVAQQQ 


6006


1 


965 


DNDFLRNTVHRHE P P VTAEP I RLLAENEDWWDKPS S IPVH P C 
GRFRHNTVIFILGKEHQLKELHPLHRLDRIjTSGVLMFAKTAAVS 
ER IHEQVRDRQLE KE YVCR VEGE FPTEEVTCKEP I LWSY KVGV 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGH P I LNDP I YNS VAWGPSRGRGGYI PKTNEELLRDLVAEHQAK
QSIaDVLDLCEGDLSPGLTDSTAPSS ELGKDDLBELAAAA \ QKME 
E VAEAAPQELD T I ALAS E KAVETD VMNQ \ RQT\TLCR VP AG ATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


HELGQVE YVFTDKTGTLTENEMQFRECS INGMKYQE INGRLVPE 
GPTPDSSEGNliSYLSSLSHLNNLSHLTTSSSFRTSPBNETELIK 
EHDLFFKAVSLCHTVQINNVQTDCTGDGPWQSNLAPSQLEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHTLE 
FDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGEIEKTRI 
HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRIFEARTALQQR\E 
EKIAAVFQFIEKDLILLGATAVEDRLQDKVRETIEALRMAGIKV 
WVLTGDKHETAVSVSLSCGHFHRTMNILELINQKSDSECAEQLR 
QLARR I TEDHVI QHGLWDGTS LS LALREHE KLFMEVCRNCSAV 
LC CRMAPLQ KAKV I RL IKISPEKPI TIiAVGDGANDVSM I Q E AHV 
GIGIMGKEGRQAARNSDYAIARFKFLSKLLFVHGHFYYIRIATL 
VQ Y F FY KNVC FIT PQFL YQ FYCL FS QQTL YDS VYLTL Y \ N I C FT 
SLP I LI YSLLEQHVDPHVLQNKPTLYRD ISKNRLLS I KTFLYWT 
I LG FS HAF I F F FG S YLL I GKDTS LLGNGQMFGNWT FGTLVF T VM 
VITVTVKMALETHFWTWINHLVTWGS I IFYFVFSLFYGG I LWPF 
LGSQNMYFVFIQLLSSGSAWFAIILMVVTCIiFLDIIKKVFDRHL 
HPTSTEKAQLTETNAG I KCLDSMCCFP EGEAACAS VGRMLERVI 
GRCS PTHI SRSWSASDPFYTNDRS I LTLSTMDSSTC 


6008 


4554 


1089 


AGVRRAGARRG PGRALP AGATAVP PPS ARRRRRCPAPEHAG P AR 
ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 
DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 
FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFNADKKTLETH 
IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 
KCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSIiNGAVPLG 
SNAREES S I HCKRCLFMPKS YEALVQHVI EDHERIGYQVTAMIG 
HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 
SQQMVNRLS I PKPNLNSTGVNMMSS VHLQQNNYGVKS VGQG YS V 
GQSMRLGLGGNAP VS I PQQSQS VKQLLPSGNGRS YGLGSEQRSQ 
APARYSLQSANAS SLSSGQLKS PSLSQSQASRVLGQS SSKPAAA 
ATGPPPGNTSSTQKWKICTICNELFPENVYSVHFEKEHKAEKVP 
AVANYIMKIHNFTSKCLYCNRYLPTDTLLNHMLIHGLSCPYCRS 
TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 
LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADI PVKS 
SPQAAVPYKKDVGKTLCPLCFS ILKGP I SDALAHHLRERHQVIQ 
TVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKTQN 
GQDKTNAP S RLNQ S P S LAPVKRT YEQM E FPLLKKR KLDDDSDS P 
SFFEEKPEEPVVLALDPKGH\EDDSYEARKSFLTKYFT\KQPYP 
TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LGFNMKE LNKVKHEMDFDAEG L FENHDE KDS RVNAS KTAD KKLN 
LGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 
SSQSEDARSSKPAAKKKATMQGDREQLKWKNSSYGKVEGFWSKD 
QSQWKNAS ENDERLSNPQI EWQNST I DS EDGEQFDNMTDGVAE P 
MHGS LAG VKLS SQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC 
HLVLGVLVPVARQSSHSAGPAQSAFR * TGTGSGTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAG E Q AS QRR T VFTAGGGECLGAKS VRAS VFTGNQ PGVMG LL 
NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCRIHI VDAVC * SEHH*DHFLAAAFLENSTI IS + VAPGSWQDHA 
VLQK^QASVRCTIGFESVDTAPAGFWAHSPPGLOGEPTTTSVSL 
FVLAPQDGEGVPFVEGQLVTVLGLWPQS IRHTFVHHTQLFLHP 
I * KLGALDVAFLHLLTLVCS S FNVAYG*GKNGGTTLHQLFAEVN 
AVTRGSAVQRRPS IT I S3 IHVDTKIQQELKDVMVAGADGWQWG 
DP F WGLAG I FHL I DD PLHQ I ELS FQRR V * EQCQG VKP DS Q P VP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHVWI VLCRLGSLVGGLGTDE LLWFGGR * LI 1 1 G 
I * * RGRLSGEWGCGLGRGELFQVS IGIGVSIVHIGQGDHEVLGG
AGL VERGALHATGQGVE ALVQQLLD VGPAGAbGLCDGAALFQG P 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CG VGGAI LLKALS Q YFL KGG * RL WCARGQ * P VKKRQRRWRG * TR 
R * NGLTIHCFN* L I * GAVC CRLV I LRWCGLLE VHGVYGT * r HCL 
GS FPGRLWP * PFI SQERPNGHCQWEFRLAVPS WKCRWSRWRVRG 
TWRYGNPLLNLL * GAWLGGAACGGQQGGPLSTWQACTGPGOAAF 
LPPFQGACRPRTQRCRTWVCPIAWRQLLAYTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AG I S QNAKTGDLPAFGECVG I AS KALCGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVD P IQFARANQAI QMACQNLVDPGSSPSQVLSAA 
TIVAKHTSALCNACRI ASS KTANPVAKRHFVQSAKEVANS TANL 
VKTI KALDGDFSEDNRNKCRI ATAPLI EAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDS IKSLITS IRDKAPGQRECDYSIDGINRC 
IRDI EQASLAAVSQS LATRDDI S VEALQEQLTS WQEIGHL r DP 
IATAARGEAAQLGHKGTQLASYFBPLILAAVGVASKILDHQQQM 
TVLDQTKITiAESALC^LYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD I MVTLNE AAS E VGLVGGMVDA I AEAMS KL»DEGT P PE P KG 
TFVD YQTTWKYS KAIAVTAQEMMTKS VTNPEELGGLAS QMTSD 
YGHLAFQGQMAAATAEPEEIGFQIRTRVQDLGHGCIFLVQKAG\ 
ALQVCPTDS YT KRE L I E CARAVTE KVS L VLS ALQAGNKGTQ ACI 
TAATAVSGI I ADLDTT I M FAT AG T LNAENS ET FAD HREN ILKTA 
KAL VE DTKLLVSGAAS T P D KLAQ AAQ S S AAT I TQLAEWKLGAA 
SLGSDDPETQWL INAI KDVAKALS DL ISATKGAAS KPVDDPSM 
YQLKGAAiCVMVTNVTSLLKTVKAVEDEATRGTRALEAT IECIKQ 
ELTVFQS KDVPEKTS S PEES I RMTKG ITMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLVI LQKPT PELKQQLAAFSKRVAGAVTEL IQAA 
EAMKGTE WVDPEDP TVI AETE LLGAAAS I EAAAKKLEQLKPRAK 
P KQADETLDFEEQ I LEAAKS I AAATS ALVKSASAAQRELVAQGK 
VGS I PANAADDGQWSQGLI SAARMVAAATSSLCEAANASVQGHA 
SEEKLIS S AKQVAAS TAQLLVACKVKADQDS EAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGG I AQ 1 1 AAQEEM 
L KKE RELE E ARKKLAQ I RQQQ YKFLPTE LREDEG 


6011 


446 


1835 


LLQPAMRKS PGLSDCLWAW I LLLSTLTGRS YGQPS LQDELKDNT 
TVFTR ILDRLLDG YDNRLRPGLGERVTE VKTDI FVTS FGP VSDH 
DMEYT I DVFFRQS WKDERLKFKGPMTVLRLNNLMAS KI WTPDTF 
FHNGKKSVAHNMTMPNKLLRITEDGTLLYTNIRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSVWAED 
GSRIiNQYDLLGQTVDSGIVQSSTGEYVVMTTHFHLKRKIGYFVI 
QTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLS IS 
ARNSL P KVA YATAMD W FIAVCYAFVFS AL I EFAT VNYFTKRG YA 
WDGKS WP E KPKKVKD P L I KKNNTYAPTATS YT PNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
G I FNLVYWATYLNRE PQLKAPTPHQ 


6012 


351 


5013 


PAELFQS FAIWHKELYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EG I Q VRE IACI QKDKD I PAEDI ICEYFEPKPLLEQACL I PCQQD 
CI VS EFS AWS ECS KTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
Q VCQSS PCEAEELRYS LHVGPWSTCSMPHSRQVRQARRRGKNKE 

EVMCINKTGKAADLSFCQQEKLPMTPQSCVITKECQVSEWSEWS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
QGDGWP CATYGWRTTE WTECR VDP LLS QQDKRRGNQTALCGGG 
IQTREVYCVQANENLLSQl^THKNKEASKPMDLKLCTGPIPNTT 
QLCH I PCPTECE VS PWS AWGPCT YENCNDQQGKKGFKLRKRRIT 
NEPTGGSGVTGNCPHLLEAIPCEEPACYDWKAVRLGDCEPDNGK 
ECGPGTQVQEWCINSDGEEVDRQLCRDAIFP I PVACDAPCPKD 
CVLS TWSTWS S CSHTCSGKTTEGKQI RARS I LAYAGEEGG I RCP 
NS S ALQE VRS CNEHPCTVYHWQTG P WGQCI E D TS VS S FNTTTTW 
NGEASCSVGMQTRKVI CVRVNVGQVGPKKCPESLRPETVRPCLL
PCKKDCIVTPYSDWTSCPS\SCKEGDSSIRKQSRHRVIIQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDSPNGAQEGCGPGRQARAITCRKQDGGQAGIHECLQYAGPV 
PALTQACQIPCQDDCQLTSWSKFSSCNGDCGAVRTRKRTLVGKS 
KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGDIKECGQGYRYQAMACYDQNGRLVETSRCNSHGYI 
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 
GGRPCPKLDHVNQAQVYEVVPCHSDCNQYLWVTEPWSICKVTFV 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPW3SR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRSCPNAVEK^PCNLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI 
KTRMLDC^SIX5KSVDLKYCEALGLEKNWQMNTSCMVECPVNCQ 
LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLMDQS 
KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 
DDFSKWDBEFCADIELIIDGNKNMVLEESCSQPCPGDCYLKDW 
SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQML
ETKS C YDGQC YEYKWMAS AWKGS SRT VWCQRS DGINVTGGCLVM 
SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTL I P VWLPTMEDKRGDVKTS RAVHPTQPS SNPAGRGRTWFLQ 
PFGPDGRLKTWVYGVAAGAFVLLI F I VSMI YLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
S I VDVE FLPVYH PSPEES RDPTL YANNVQRVMAQALG I PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
AR PES NDQ PGR VCQAATAL 


6014 


2857 


613 


EAVAGGMEK5RMNLPKGPDTLCFDKDEFMKEDFDVDHFVSDCRK 
RVQLEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVGM 
DKALNQLS VP LGQLRE EVLS LRS S VS EGIRAVDERMS KQEDI RK 
KKMCVLRLIQVIRSVEKIEKILNSQSSKETSALEASSPLLTGQI 
LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAG1TAMLQQSLE 
GLLLEGLQTSDVDIIRHCLRTYATIDKTRDAEALVGQVLVXPYI
DEVI I E QF VE S H? NGLQVM YNKL LE FV PHHCRLLREVTGG A I SS 
EKGNTVPG YDFL VNS VW PQ I VQGLEEKLPS LFNPGNPDAFHEKY 
TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RFRE I AGSLE AALTDVLEDAPAES P YCLLASHRTWS SLRRCWSD 
EMFLPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 
IKKPLVTGSKEPS ITQGNTEDQGSG P SETKP WS ISRTQLVYW 
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAALEDSQSSFS
ACVPSLSSKI I QD LSD S C FG FLKS ALE VPRL YRRTNKE VPTTAS 
SYVDSALKPLFQLQSGHKDKLKQAIIQQWLEGTLSESTHKYYET 
VSD VLNS VKKME E S LKRLKQARKTT P ANP VG P SGGMS DDD K I RL 
QLALD VE YLGEQ I QKLGLQASDI KS FSALAELVAAAKDQATAEQ 
P 


6015 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
^GKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
RITVWRS RSGNELPLAVAS TADL I RCKLLDVTGGLGTDELRLLY 
GMALVRFVNLISERKTKFAKVPLKCIAQEVNIPDWIVDLRHELT 
HKKMPHINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAI KAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
Q ES P T AEN ARLLAQKRGALQG S AWQ VS S ED VR WDTFP \ LG RM PR 
SRPRTPAELMLENYDTHVI FWTKPVL \ EQRLE PSTCK\TDTLG L 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF


6016


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC
VKGKGS LPLS AHG I WAWLSRAE WDQVTVYLFCDDHKLQRYALN 
R ITVWRS RSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRFVNL IS ERKTKFAKVPLKCLAQE VNI PDW I VDLRHELT 
HKKM PH I NDCRRGC YFVLD WLQ KT YWCRQLENS LRE TWE LE E FR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAI KAWNNPS PRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FE QLAALQ I E YEENVDLNDVL VP KP FSQ FWQ PLLRGLHSQNFTQ 
ALLERMLS EL PALG I SGI RPTY I LRWTVELI VANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F \ KAMGQG LQD E \ EQE KLLR I CS I YTQS GENS LVQ EGS EAS PIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
E KEVLPDQ VE E EE ENDDQEEE E EDEDDE DDE E E DR ME VG PFS TG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF 


6017 


203 


3469 


SHQE IEQNSAMAPRKRGGRGIS FI FCCFRNNDHPEITYRLRNDS 
N F ALQTM E PAL P MP P VE ELDVM FS EL VDE LDLTDKHREAM FALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL 
EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK 
TMDYETSES R I HTSL IGCI KALMNNSQGRAHVLAHS ES INVI AQ 
S LS T EN I KT KVAVLE I LGAVCL VPGGHKKVLQAM LHYQ KYASER 
TR FQTLINDLDKSTGRYRDE VS LKTAI MS F INAVLSQGAGVESL 
DFRLHLRYE \ FLMLGIHPVMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHIDTKSATQMFELTRKRLTHSEAYPHFMSILH 
HCLQMPYKRSGNTVQYWLLLDRIIQQIVIQNDKGQDPDSTPLEN 
FN I KNVVRML VNENEVKQWKEQAE KMRKEHNE LQQ KLE KKEREC 
DAKTQEKEEMMQTLNKMKEKLEKBTTEHKQVKQQVADLTAQLHE 
LSRRAVCAS I PGGPSPGAPGGPFPS S VPGS LLPP P PP P PLPGGM 
LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 
PTNALKS FNW S KLPENKLEGTVWTE I DDTKV F K I LDLEDLERTF 
SAYQRQQDFFVNSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 
QNCNI LLSRLKLSNDE I KRAI LTMDEQEDLPKDMLEQLLKFVPE 
KSD I DLLEEHKHELDRMAKADRFLFEMS RINHYQQRLQS LYFKK 
KFAERVAEVKPKVEAIRSGSEEVFRSGALKOLLEWLAFGNYMN 
KGQRGNAYG F K I S S LNKI ADTKSS I D KN I TLLKYL I T I VEN KY P 
SVLNLNEELRDIPQAAKVNMTELDKEISTLRSGLKAVETELEYQ 
KS Q P PQ PGDKF VS WSQ F I T VAS FS F S D VEDLLAE AKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENS E E S GE FDDLVS ALRSGEV 
FDKDLS KLKRNRKRITNQMTDS SRERP I TKLNF 


6018 


13 


2510 


TISQSGG IRRRREAVWFEWNMDFSRLHMYS P PQCVPENTGYTY 
ALS S S YS S DALD FETEHKLD P VFDS PRMS RRS LRLATTACTLGD 
GEAVGADSGTSSAVSLKNRAARTTKQRRS TNKSAFS INHVSRQV 
TS SGVS YGGTVSLQDAVTRRPP VLDES W I REQTTVDHFWGLDDD 
GD LKGGNKAA I QGNGDVGAG AATGHNG F F CS NCNMLSERKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAG Y FLLQ ILRR IGAVGOAVS RTAW S ALW LAWAPG KAASG VF 
WWLGIGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 
L S LRGOG \ N FF S FLPVLNWAS MHRTQRVDD PQDVFKPTTSRLKQ 

LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HLED I LGKLREKSEAIQKELEQTKOKTI SAVGEQLLPTVEHLQL 
ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMVKLLFSED 
QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 
TKQLPTS EAWSAVS EAG ASG I TEAQARAI VNSALKLYSQDKTG 
M VD FALESGGGS ILSTRCSETYETKTALMS LFGI PLWYFSQS PR 
WIQPDI YPGNCWAFKGSQGYLWRLSMM I HPAAFTLEH I PKTL 
S PTGN I S S APKDFAVYGLENE YQEEGQLLGQ FTYDQDGESLQMF 
QALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK 


6019 


2 


1066 


TPNDREPPPQRPPSSRRASHLAQEITSAASLGDQTQILGSLTTA 
PVITSAIRSMPGISSQILTNAQGQVIGTLPMWNSASVAAPAPA
QSLQVQAVTPQLLLNAQGQVI ATLAS SPLPPPVAVRK\PSTPES 
LLKSEVQP I KP TPTVPQPAWI AS PAPAAKPSAS AP IPITCSET 
PTVSQLVSKPHTPSLDEDGI NLEEIRE FAKNFKI RRLS LGLTQT 
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN 
EAELRNQEGQQNLMEFVGGEPSKKRKRRTS FTPQAI EALNAY FE 
KNPLPTGQE ITE I AKELNYDREWRVWFCNRRQTLKNTS KLKVF 
QIP 


6020 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW
AHTKP WTLTS YWED I S HRLDAVNTLLAMAERLQTN I EALKS G I 
QGK I PANQLAELWLKL I DEVI EDTRYTLPLTEGKANVTVLDTQ I 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAE I EDWLDKLMQLTE 
E PQNSMPD IIIWMI RGE KRLAYAR I PAHQVLYS TSGENASGKYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FL P PKGWE WEGEW I VD PERS LLTEADAGHTE FTDEVYQNESR YP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVD E KGWE YG I T I P PDHKPKS WVAAEKM YHTHRRRR LVR KRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP 
YAHICFLHRSKTTEI IHSTLNPTWDQTI IFDEVEIYGEPQTVLQ 
N P PKV I ME L FDNDQ VGKDE FLGRS I FS P WKLNS EMD I T PKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QG IRPWQLTAI EILAWGLRNMKNFQMASI TSPSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV
DWWS KFYAS SGEHEKCGQYIQKGYSKLKIYNCELENVAE FEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD

TFTRDEKVGETI IDLENPF\LSRFG\SHCG\ IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE
FEANKI LHQHLGAPEERLALHI LRTQGLVPEHVETRTLHS TFQP 
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRS LDGEGKFNWRFVFP FDYLPAEQLC I VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/RGLDM I PDLKAMNP LKAKTASL FEQKSMKG WW 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 


EAIQFEVS I GNYGNKFDTTCKPLASTTQYS RAVFDGNY YYYLP W 
AHTKP WTLTS YWED I SHRLDAVNTLLAMAERLQTN I EALKSG I 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDI I IWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
G KTQT I FL K YPQ EKNNG PKVP VE LRVN I WLGLS AVE KKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
G G DWKP AE DT YTDANGDKAAS P S ELTCP PG WE WEDDA W S YD INR 
AVDEKGWEYGITIPPDHKPKS WVAAEKM YHTHRRRRLVR KRKKD 
L TQTAS S TAG AME E LQDQEGWE YAS L IG W KFHWKQRS S D TFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TT VFGANTP I VS CN FDRDY I YHLR CYVY QARNLLALDKDS FSD P 
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ
NPPKVIMELFDNDQVGKDEFLGRS I FSPWKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 
QG I RPWQLTAIE I LAWGLRNMKNFQMAS ITS PSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR
D I VI EMEDTKPLLAS KCLSSMSTALSKMAS PATVHLTEKEEEI V 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDS VPQECTVRI YIVRGLELQPQDNNGLCDP Y I KITLG 
KKVI E\DRDHYI PNTLNPVFGRMYELSCYLPQEKDLKI SVYDYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\ I PEEYCVSGVNTW 
RDSLR\PTQ \ LLQNVARFKGFPQP I LSEDGSRI R YGGRD YSLDE 
F EANKILHQHLGAPE ERLALH I LRTQGL VPEHVETRTLH S TFQ P 
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
S IDQTEFRI PPR\LI IQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/ RGLDMI PDLKAMNPLKAKTASLFEQKS MKGWW 
P C YAE KDG AR VMAG KVE MTLE I LNE KEADE RP AGKGRDE PNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6022 


4953 


549 


EAI QFE VS IGNYGNKFDTTCKPLAS TTQYS RAVFDGN YYYY LPW 
AHTKP WTLTS YWED ISHRLDAVNTLLAMAERLQTNIEALKSGI 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 
R KLRS RSLS Q I HEAAVRMRS EATD VKS TLAE I EDWLDKLMQLTE 
EPQNSMPD I I I WMIRGEKRLAYARI PAHQVLYSTSGENASGKYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCP PGWE WEDDAWS YD INR 
AVD E KGWE YG I T I P PDHKPKS WVAAE KM YHTHRRRRL VRKR KKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTP I VS CNFDRD YI YHLR C YVYQ ARNL LALDKDS FS D P 
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIE ILAWGLRNMKNFQMAS I TSPSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
D I VI EMEDTKPLLAS KCLS SMSTALS KMAS PATVHLTEKEEE I V 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVI E\DRDHY I PNTLNPVFGRMYELSCYLPQEKDLKI SVYDYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\I PEEYCVSGVNTW 
RDSLR \ PTQ\LLQNVARFKGFPQPILS EDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVI IWNTKDVILDEKS ITGEEMSDIYVKGWIPGNEE 
NKQKTDVH YRSLDGEGNFNWRFVFPFD YLPAEQLC I VAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKS PGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 
P C YAE KDGAR VMAG KVEMTLE I LNE KEAD E RP AG KGRDE PNMN P 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 
L FVAVLLYS LPNYLSMKI VKPNV 


6023 


102 


916 


S QE LGM FVELNNLLNTTPDRAEQGKLTLL CDAKTDG S FLVHHFL 
S FYLKANCKVCFVAL IQSFSHYSIVGQKLGVSLTMARERGQLVF 
LEGL/ IVCSGR\VFQAQ KEPHPLQFLREANAGNLKPLFEFVREA 
LKPVDSGEARWTYPVLLVDDLSVLLSLGMGAVAVLDFIHYCRAT 
VCWELKGNMWLVHDSG D AED EEND I LLNG LS HQSHL I LRAEG L 
ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF 
AKGMSPAVL 


6024 


3 


3260 


FLS FLCY P RFRCL F CLQ F AI PAS RMEQLNELE L LME KS FWEEAB 
LPAELFQKKWASFPRTVLSTGMDNRYLVLAVNTVQNKEGNCEK 
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA 
TRQML IGTVLHEVFQKAINNSFAPE KLQELAFQTIQE I RHLKEM 
YRLNLSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL 
P SDNS KDNS TCN I E WKP MDIEESIWSPR FGLKG K I DVTVG VK I 
HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 
EAG LLLYL KTGQM Y P VPANHLDKRE LLKLRNQMAF S LFHR I S KS 
ATRQKTQLASLPQIIEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLTLESQSKDNKKN 
HQN I W LMP AS EME KS GS C I GNL I RMEHVK I VCDGQ YLHNFQ CKH 
GAIPVTNLMAGDRVIVSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLS VL PES TLFRLDQ EE KNCD I DT P LGNLS KLMENT F VS KKLR 
DLIIDFREPQFISYLSSVLPHDAKDTVACILKGLNKPQRQAMKK
VLLSKDYTLIVGMPGTGKTTTICTLVRILYACGFSVLLTSYTHS 
A VDN I LLKLAXFKI G FLRS R \Q I Q KVHPA I QQ FTEHE I CRS KS I 
KS\LALLEELYrSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 
I SQP I CLGPLFFSRRFVLVGDHQQLP PLVLNREARALGMS ESLF 
KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSNVTEAKLIVFLTSIFVKAGCSPSDIGIIAP
YRQQLKI INDLLARS IGMVEVNTVDKYQD\RDKS I VLVSFVRSN 
KDGTVGELLKDWRRLNVAITRAKHKLILLGCVPSLNCYPPLEKL 
LNHLNSEKLI IDLPSREHESLCHILGDFQRE 


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 
GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVS YADTEL I P 
AACGATLPALGLRSSAQDPQAVLGALGRALSPLEEWLRLHTYLA 
GEAPTLADLAAVTALLLPFRYVLDPPARRIWNNVTRWFVTCVRQ 
PE FRAVLGE WL YSGAR PLS HQ PG P E APALP KTAAQ LKKEAKKR 
EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 
PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS 
AANPRG VFMMC I P P PNVTGS LHLGHALTNA IQDS LTR WHRMRGE 
TTLWNPGCDHAGIATQVVVEKKXWREQGLSRHQLGREAFLQEVW 
KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 
LHEEGIIYRSTRLVNWSCTLNSAISDIEVDKKELTGRTLLSVPG
YKEKVEFGVLVS FAYKVQGSDS DEE VWATTRI ETMLGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFGTGAVKIT 
PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR 
KAVLVALKERGLFRG I EDNPMWPLCNRS KDWE PLLRPQWYVR 
CGEMAQAASAAVTRGDLRI LPERHQRTWHAWMDNI RE \ WCMFPG 
KLWWG \HR\I PAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 
KAAKEFGVS PDKI S LQQDEDVLDTWFSSGLFPLS I LGWPNQS ED 
LSVFYPGTLLETGHDILFFWVARMVMLGLKLTGRLPFREVYLHA 
I VRDAHGRKMSKS LGNVI DPLD VI YG I S LQGLHNQLLNS NLD P S 
E VEKAKEGQKADFPAGI PECGTDALRFGLCAYMSQGRDI NLDVN 
RILGYRHFCNKLWNATKFALRGLGKGFVPSPTSQPGGHESLVDR
WIRSRLTEAVRLSNQGFQAYDFPAVTTAQYSFWLYELCDVYLEC 
LKP\n^GVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 
RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELALSITRA 
VRP \LRAD YNLHPESGPTC FLE VAD\EATGALASAVSG YVQGPG 
QAQVWAVAEPWGLPAP\QGCAVALASDRCS I \HLQLQG\LLDP 
ARELG\ KLQ\ AKRVEAQ \RQAQ \RLR\ERRA\ ASGNP VKVPL\E 
VQEADEAKLQQTEAELRKVDEAIALFQKML 


6026 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC 
TCEIRPWFTPRSIYMEASrVDCNDLGLLTFPARLPANTQrLLLQ 
TNNIAKIEYSTDFPVm J TGLDLSQ^INr^SVTNrNGKKMPQLLSV 
YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLQM INS KWFDALPNLE ILM IGENPI I R I KDMNFK 
PL INLRSLVIAG INLTE I PDNALVGLENLES I S FYDNRL I KVPH 
VALQ KWNL KFLDLNKNP I NR I RRGD FSNMLHLKELG INNMPEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLML 
NSNALSALYHGTI ESLPNLKE IS IHSNP I RCDCVI RWMNMNKTN 
IRFME PDSLFCVDPPE FQGONVRQVHFRDMME ICLPL I APES FP 
SNLNVE AGS YVS FHCRATA\ E PQPE I YW I T PSGQKLLPNT\ LTD 
KFYVHSEGTLDINGVTPKEGGLYTCIATNLVGADLKSVMIKVDG 
S FPQDNNGSLNI KIRDIQANS VLVS WKASSKILKSSVKWTAFVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTEYKICIDIPTIYQKNR 
KKCVNVTTKGLHPDQKEYEKNNTTTLMACI/3GLLGIIGVI CL I S 
CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254 


4148 


GGRRAPGRPGRS I KDEEEETVFREWS FS PDPL PVR YYDKDTTK 
PIS FYLS S LEELLAWKPRLEDG FNVALEPLACRQPPLS S QRPR T 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TIP P VG WTNTAHRHG VCVLGT F I TEWNEGGRLCBAFLAGDERS Y 
QAVADRLVQ I T\RFFR FDGWL INIENSLSLAAVGNMPPFLRYLT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTNYNWRE EHLERMLGGAGERRADVYVG VDVFARGNVVGGRFDT 
DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
D P VALRNR CPAPAKL CPH 


6028 


120 


3432 


NCLbLQAKGFHGEIEDLQQWLTDTERHLLASKPLGGLPETAKEQ 
LN^MEVCAAFEAKEETYKSLMQKGQQMLARCPKSAETNIDQDI 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 
ELDKTGTHLKYFSQKQDWLI KNLLI S VQSRWEKWQRLVERGR 
SLDDARKRAKQFHEAWS KLMEWLEESEKSLDSELE IANDPDKI K 
TQIAQHK^FQKSLGAKHSVYDTTNRTGRSLKEKTSLADDNLKLD 
DMLS E LRDKWDTI CG KS VE RQNKL E E A \ L L FS GQFTDALQAL I D 
WLYRVEPQLAEDQPVHGD I DLVMNLIDNHKAFQKELGKRTSSVQ 
ALKRS AREL I EGSRDDSS WVKVQMQELS TRWETVCALS I S KQTR 
LEAALRQAEEFHSVVHALLEWLAEAEQTLRFHGVLPDDEDALRT
LIDQHKEFMKKLEEKRAELNKATTMGDTVLAI CHPDS ITTIKHW 
I T I I RARFEE VLAWAKQHQQR LAS ALAGL I AKQ E LLEALLAWLQ 
WAETTLTDKD KEVI PQE I EEVKALIAEHQTFMEEMTRKQPD VDK 
VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSOT 
QIETKNPRVNLLVSKWQQVWLLAIjERRRKLNDALDRLEELREFA 
NFDFD I WRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQE FID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
YKPITDADKIEDEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVR I LRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 
NMELREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAAS PQVPATTT P KI LHPLTRNYGKPWLTNS KMSTPCKAA 
ECSDFPVPSABGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 
IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM 
AG I S QNAKTGD L PAFGE C VG IAS KAL CGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP IQFARANQAIQMACQNLVDPGS S PSQVLS AA 
T I VAKHTSALCNACR IAS S KTANPVAKRH F VQS AKE VANSTANL 
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAIMPKDP 
PTWS VLAGHSHTVSDS I KSLITS IRD KAPGQRECDYS IDG INRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSVVQEIGffljIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVGVAS KI LDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDDIMVTLNEAASEVGLVGGMVDAIAEAMSKLDEGTPPEPKG 
T FVDYQTT WKYS KAIAVTAQEMMTKS VTNPEELGGLASQMTSD 
YGHLAFQGQMAAATAE PEE I GFQ I RTRVQDLGHGC I FLVQKAG\ 
ALQVCPTDS YTKRELI ECARAVTE KVS LVLSALQAGNKGTQACI 
TAATAVSGI I ADLDTTI MFATAGTLNAENSETFADHRENI LKTA 
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQIAEWKLGAA 
SLGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTS LLKTVKAVEDEATRGTRALEAT I ECI KQ 
ELTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGNSCRQE
DV I ATANLSRKAVS DMLTACKQAS FHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLV I LQKPTPELKQQLAAFS KRVAGAVTELIQAA ' 
EAMKGTEWVDPEDPTVI AETELLGAAAS I EAAAKKLEQLKPRAK 
PKQADETLDFEEQ I LEAAKS I AAATS AL VKSAS AAQREL VAQG K 
VG S I PANAADDGQWS QGL I SAARM VAAAT S S LCEAANAS VQGHA 
S E EKLIS S AKQ VAASTAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRAS DNLVRAAQKAAFGKADDDDWVKTKFVGGI AQ I IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG


6030 


3 


1777 


FPGRGSPALQLEVLI CLGLMGLERALNVLAP IFYRNI VNLLTEN 
APWNSIAWTVTSYVFLKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVELLIFSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
S YLVFNVIPTLADI I IGI I YFSMFFNAWFGLIVFLCMSLYLTLT 
IWTEWRTKFRRAMNTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAII K YQGL E WKS S AS L VLLNQTQNL V I G LGLLAGS LLC 
AYFVT EQKLQ VGD YVL FGT YI I Q L YMP LNW FGT YYRM I QTNFI D 
MENM FDLLKK \ETEVKDLPGAGP FRFQKGR I EFENVHFS YADGR 
ETLQD VSFT VMPGQTLALVGPSGAGKST I LRLLFRFYD I S SGC I 
RIDGQDISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGNDEVEAAAQAAGIHDAIMAFPEGYRTQVGERGLKLSGGEKQR 
VAI ARTI LKAPG I ILLDEATS ALDTSNERAIQASLAKVCAMRTT 
I WAHRLS TWNADQ I LVIKDGCI VERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMS ENLDKSNVNEAGKSKSNDSEEGLEDAVEGADEALQKAI KS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTNMALAHEIVVNG 
DFQIKPVELPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
I KLVGEI KETLLS FLLPGHTRLRNQITEVLDLDL I KQEAENGAL 
DI S KLAEFI IGMMGTLCAPARDEE VKKI.KD I KEX VPLFRE r FS V 
LDLMKVDMANFAI SS IR PHLMQQS VE YERKKFQEI LERQ PNS LD 
FVTQWLEEAS EDLMTQKYKHALP VGGMAAGSGDMPRLS P VAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV 
LLVTFSMAAPG IS SQADFAEKLKM I VK I LLTDMHLPSFHLKDVL 
TT I GEKVCLE VS S CLSLCGS S PFTTDKETVLKGQI QAVAS PDDP 
IRR I MES R ILTFLETYLASGHQKPLPTVPGGLS PVQRELEEVAI 
KFARLVNYNKMVFCP YYDAI LSKILVRS 


6032 


39 


2415 


AARL CRAQ PT KS AWM I RDLS KM YPQTRH PAPHQ P AQ PFKFT I S E 
S CDR I KEEFQFLQAQ YHSLKLECEKLASE KTEMQRHYVMY YEMS 
YGLN I EMHKQAEI VKRLNAI CAQVI P FLSQEHQQQWQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK
SDDNLWD VSNEDPSSPRGS PAHSPRENGLDKTRLLKKDAP IS P 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS
NST PGLR P VPGKP PG VDPLAS S LRTPMAVPCP Y PTP FG I VPHAG 
MNGELTS PGAAYAGLHNIS PQMSAAAAAAAAAAAYGRS PWGFD 
PHHHMR VPAI PPNLTG I PGG K PAYS FHVS ADGQMQ P VP FP PDAL 
I GPG I PRHARQ INTLNHGE WCAVTISNPTRHVYTGGKGCVKVW 
DISH PGN KS P VS QLDCLNRDNY I RSCRLL PDGRTL I VGGEASTL 
S I WDLAAPTPR I KAELTSSAPAC YALAI S PDS KVCFS CCSDGNI 
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS
W \ DLREGRQLQQHD / FFTS P VFSLGYCP \TEEWLAVGMENSN\ V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 

RATVYEVIY


6033 


39 


2415 


AARLCRAQPTKSAWM I RDLS KMYPQTRHPAPHQPAQ P FKFT I S E 
S CDRI KE E FQFLQAQ YHSLKLE CEKLAS EKTEMQRHYVMYYEMS 
YG LN IEMHKQAE I VKRLNAI CAQVI P FLSQ EHQQQ WQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAI P 
P IGSSAGLLALSSALGGQSHLPI KDEKKHHDNDHQRDRDS IKSS 
S VS PSAS FRGAEKHRNSADYS S ES KKQKTEEKE I AAR YDS DGE K 
S DDNL WDVSNED P S S PRGS PAHS PRENGLD KTRLLKKDAP ISP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVD PLAS S LRTPMAVPCPYPT PFG I VPHAG 
MNGELTS PGAAYAGLHNIS PQMSAAAAAAAAAAAYGRSPVVGFD 
PHHHMRVPAI PPNLTG I PGGKPAYSFHVSADGQMQPVPFPPDAL 
IGPGI PRHARQINTLNHGEVVCAVTISNPTRHVYTGGKGCVKVW 
D I SH PGN KS P VS QLD CLNRDN Y I RS CR LLPDGRTL I VGGEAS TL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
E VLHVT K P DK YQLHLHE S CVLS LKFAHCG KW F \ VS TGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


E S GRRRRLKRRR S P C PGTAGG PGE TN PG PGAC PRG P R E E AAAAM 
E I APQEAP P VPGADGD I EEAPAEAGS PS PAS PPADGRLKAAAKR 
VTFPSDEDIVSGAVEPKDPWRHAQNVTVDEVIGAYKQACQKLNC 
RQ I P KLLRQLQE FTDLGHRLDCLDLKGE KLD Y KTCEALEE VFKR 
LQ FKWDLE QTNLDE DG AS AL FDM I E Y YE S ATHLN I S FNKH I GT 
RGWQAAAHMMRKTSCLQYL\DARNTPLLDHSAPFVARALiRIRSS 
LAVLHLENASLSGRPIjMLLATALKMNMNLRELYL\ADNKLNGLQ 
DSAQLGNLLKFNCSLQILDLRNNHVLDSGIAYICEGLKEQRKGL 
VTL\VLWNNQLTHTGMAFLGMTLPHTQSLETLN1jGHNPIGNEGV 
RHLKNGLISNRSVLRLGLASTKLTCEGAVAVAEFIAESPRLLRL 
DLRENEIKTGGLMALSIJ^KVNHSLLRl.DLDREPKKEAVKSFIE 
TQKALLAE I QNGCKRNLVIAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETL 


6035 


19 


404 


SVTYLGI I LHKNTGALPADPVQLI SQTPTPSTKQQLLSFLGMVG 
YF YLWI PGFAI LTKPLCKLTKENLADAIDPKS FS HS S FRSLKTA 
LENASTLALPDSSQPF\SLHTAEVQGCWEILTQGLGPLPV 


6036 


1745 


356 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGQGRGVEKPPHLAALILARGGSKGIPLKNIKHLAGVPLIGW 
VLRAALDS GAFQS VWVSTDHDE I ENVAKQFGAQVHRRSSEVSKD 
SSTSLDAIIEFLNYHNEVDIVGNIQATSPCLHPTDLQKVAEMIR
EEGYDS VFS WRRHQFRWSE I QKGVREVTEPLNLNTPAKRPRRQD 
WDGELYENGS FYFAKRHL I EMGYLQGGKMAYYEMRAEHS VDIDV 
DIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLAWDEWRKEMGLCWKEVAYLGNEVSDEECLK 
RVGLSGAPADACSTAQKAVGYI CKCNGGRGA\ IRE FAEHI C\LL 
MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTS WWM S S VLT I LLFS LQGNKMLNYS AP S AGG YLL PRKP VGTPA 
GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
HNAEERRALAGARDLS ADRPRLQHS FS FAGFPSAAATAAATGLL 
DS PTS ITPPPI LS ADDLLGS PTLPDGTNNPF\AFS SQELAS L FA 
PSMGLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSLSDQEGYLS 
SSSSSHSGSDSPTLDNSRRLPIFSRLSISDD 


6038 


1450 


426 


S S ALQE FGTRNHT FGVPL PHRRKQ IIS CN I CQLR FNS DSQAAAH 
YKGT KHAKKLKALE AMKNKQKS VTAKDSAKTTFTS I TTNT I NTS 
aU&iLXj iJ\i*it>AXb 1 1 L 1 vhlRKbb VMTTEITSKVEKSPTTATG 
NSSCPSTETEEEKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGT I KAFPRAG VKGKGPVNKGNTTGLQNKTFHCE I CD 
VHVNSETQLKQH I S SRRHKDRAAGKPPKPKYS P YNKLQKTAHPL 
GVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSSPFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEKAAKITELINKL 
N FLDEAE KD LAT VNS N P FDD P DAAELNP FGD PDS EE P I TETAS P 
RKTEDSFYNNSYNPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST
KKKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VN P VQB LE T E RR VKRKAP AP PVLS P KTG VLNENTVS AG KDLS TS 
PKPSPIPS P VLGRKPNASQS LLVWCKEVTKNYRGVKI TNFTTS W 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SRLLEPSDMVLLAI FDKLTVMTYL YQ IRAHFSGQELNWQ I EEN 
S S KST YKVGNYETDTNS S VDQEKF YAELSDLKREP ELQQ P I SGA 
VDFLSQDDSVFVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKKRLLKA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENSRSLECRSDPESPIKKTSLSPTSKLGYSYSRDLDLAKKKHA 
SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
LEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 
DTNE E I PEG FWGGGD E LTNLENDL DTPEQNS KLVDL KL KKLLE 
VQ PQ VANS P S SAAQKAVTES SEQDMKSGTE DLRTERLQKTTERF 
RNPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAITETQRKPSEDEVLNKGFKDS \SQYWGELAALENEQKQ 
I DTRAAL VE KRLRYLMDTGRNTE E E E AMMQE W FMLVNK KNAL IR 
RKNQLSLLE KEHDLERRYELLNRELRAMLA I EDWQKTEAQKRRE 
QLLLDELVALVNXRDALVRDL.DAQEXOAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 


6040 


475 


1052 


PTALMTAPSCAFPVQFRQPSVSGLSQITKSLYISNGVAANNKLM 
LS S NQ I TMVT NVS VE WNTL Y ED I Q YMQVP VAD S PNS RL CD F FD 
PIADHIHSVEMKQGR\TLLHCAAGVSRSAALCLAYLMKYHAMSL 
LDAHTWTKSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPV
GM I PD I YEKEVRLM I PL 


6041 


2 


3886 


TEKDEKTAHNLENVLIHFWERLSEICVAKISEPEADVESVLGVS 
NLLQVLQKPKGSLKSSKKKNGKVRFADEI LESNKENEKCVSS EG 
EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 
VNERKS EQHLRFLS TLLDS FS SSRVFKMLLGDEKQS I VQAKPLE 
IAKLVQKNPAVQFLYQKLIGMLNEDQRKDFGFLVDILYSALRCC 
DNDMERKKVIJDDLTKVDLKWNSLLKIIEKACPSSDKHALVTPWL 
KGD I LGE KLVNLADCLCNEDLES R VS SE S HFSER WTLLS L VL S Q 
HVKNDYLIGDVYVERIIVRLHETLFKTKKLSEAESSDSSVSFIC
DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 
CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 
D I NS LQ VLLS AVDDLLNTLLE S E DS YLMG VYIGS VM PNDSEWE K 
MRQSLPMQVOiHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHLCT 
SALLSKMVLIALRKETVLENNELEKIIAELLYSLQWCEELDNPP 
IFLIGFCEIIiQKMNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 
SLIIAKLILSRSISSDEVKPHYKRKESFFPIiTEGNLHTIQSLCP 
FLSKEEKKEFSAQCI PALLGWTKKDLCSTNGGFGHLAI FNSCLQ 
TKS IDDGELLHGILKI I ISWKKEHEDIFLFSCNLSEAS PEVLGV 
NI E I I RFLSLFLKYCSS PLAESEWDFIMCSMLAWLETTSENQAL 
YS I PLVQLFACVSCDLACDLSAFFDSrTLDT IGNLP VNL I SEWK 
EFFSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCETLTY 
I S KEQLLS H KL P ARL VADQ KTNLPE YLQTLLNTLAPL LL FRARP 
VQ I AVYHMLYKLMPELPQYDQDNLXS YGDEEEEPALS PPAALMS 
LLSIQEDLLENVLGCIPVGOrVTIKPLSEDFCYVLGYLLTWKLI 
LTFFKAASSQl4RALYSMYLRKTKSIiNKLLYHLFRLMPENPTYAB 
TAVE V PNXD P KT FFTE E LQLS I RETTM LP YH I PHLACS VYHMTL 
KDLPAMVRLWWNS SEKRVFNI VDRFTS KYVS S VLSFQE I S S VQT 
STQLFNGMTVKARATTREVMATYTIEDIVIELIIQLPSNYPLGS 
1 1 VE S G KR VGVAVQQWRNWMLQLS T YLTHQNGS I MEGLALWKNN 
VDKRFEGVEDCMICFSVIHGFNYSLPKKACRTCKKKFHSA\CLY 
KWFTSSNKSTCSLCRETFF 


6042 


1306 


253 


MAELAPASPSD I KASVSNGDTTL LCSRRQSCGMNE VRQVSLT YP 
G S PAP S HS L PLQ PRSGGSLCPS RAW / P D PHQL FDDTS S AQSRG Y 
GAQRAPGGLSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 
LVDKNIDRFI P ITKLFCYYFAVDTMYVGRKLGLLFFPYLHQDWEV 
QYQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGLALGTQDRFS 
PDLIiGLQASSAIAWLTLEVliAILLSLYLVTVNTDLTTIDLVAFL 
GYKYVGMIGGVLMGLLFGKIGYYLVLGWCCVAIFVFMIRTLRLK 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)








I LAD AAAE G V P VRGARNQLRMYLTMAVAAAQ PMLM YWLTFHL VR 


6043 


403


599 


LCLFFPFPCATPVLPLPSLI SAL/ CLSHLSVSSWFCPCQPPLPC 
PLPPLQNKTAKGSLSTEQSERG 


6044 


793 


412 


KLEMWNFTLI SKVKI SREVTMI AS KFG IGQQVRHS LLGYLGVW 
D I DPVYS LS E PS P DELAVNDELRAAPW YHWMEDDNGL P VHT YL 
AEAQLSSELQDEHP\EQPSMDELAQTIRKQIiQAPRLRN 1 


6045 


155 


2299 


S P L PQVAAMNYLRRRLS DSN FMANLPNG YMTD LQR PQPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAV KQTTAAAAAT FSEQVGGG SGGAGRGGAASRVLLVI DEPHT 
DWAKYFKGKKIHGEIDIKVEQAEFSDLNLVAHANGGFSVDMEVL 
RNG VKWRS LKPDF VL I RQHAFS MARNGD YRS L VI GLQ YAGI PS 
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
BMLSS \TTYPVWKMGHGTLWGWGKVKVDNQHDFQD I AS WALT 
KTYATAEPFIDAKYDVRVQKIGQNYKAYMRTSVSGNWKTNTX3SA 
MLEQ IAMSDR YKLWVDTCSE I FGGLD I CAVEALHG KDGRDH I I E 
WGSSMPLIGDHQDEDKQLIVELVVNKMAQALPRQRQRDASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
AS QAAPPTQGQGRQ S R P VAGG P GAP P AARPPAS PS PQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SLIMDSPRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPLS PGLPAMGGPGPGP CEDPAGAGGAGAGGSE PLVTVTVQCA 
FTVALRARRGADLSSLRALLGQALPHQ\AQLGQLSYIiAPGEDGH 
WW I PEEESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAOGPEDLGFRQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCF 
WP AG PRMSGAPGRL PRS QQGDQ P 


6047 


49 


1405 


PVLVTSLRMREADTLRPPQLMEVSADIISTVEFNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDEEGKLKDLSTVTSLQVP VLKPMDLMVEVS PRR I FANGHTYH 
INSISVNSDCETYMSADDLRINLWHLAITDRSFTP\NIVDIKPA 
NMEDLTE VI TAS EFHPHHCNLFVYS S S KGS LRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
T VKVWDL\NMEARP I ETYQVHDYLRS KLCSLYENDCIFDKFECA 
WNGS DS V I MTGA\ YNN F FRM FDRNTKRD VTL \ E AS RESS KPRAV 
LKPRRVCVGGKRRRDDI S VDSLDFTKKI LHTAWHPAEN I IAIAA 
TNNLYI FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTSNSSKTRAGANSKGRRGSQNSSEHRPPASSTSEDVKASPSS 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
E PT VLDRNCPS PVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 
SKPEADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 
PKVRLVEPHSPSPSSKFSTKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 
LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLLNGSSDPHQSR 
LAS I KAEADKI YS FTDNAPS PS IGGS SRLENTTPTQPLTPLHW 
TQNG AE AS S VKTNS PAYS D I S DAGEDG EG KVDS VKS KDAEQL VK 
EGAKKTLF PPQPQS KDS P YYQGFES YYS PS YAQSS PGALNPSSQ 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
I QQR PNM YMQSLY YNQ YAYVP P YG YS DQ S YHTHLLSTNTAYRQQ 
YE EQQKRQS LEQQQ RG VDKKAEMGLKE RE AALKEE WKQKPS I P P 
TL T KAPS LTDLVKS G P G KAKE PGADP AKS VI I PKLDDS S KLPGQ 
AP EGLKVKLSDASHLS KEAS EAKTGAECGRQAEMDP I LWYRQEA 
E PRMWTYVYPAKYSDI KSEDERWKEERDRKLKEERSRS KDSVPK 
EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY 
IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 
YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 
SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 
SPSQRLMSTHHHHHHLGYSLLPAQYNLPYAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTG VFDRRVPS IRSGDFQAPFQTSAAMHH PSQES PTLPES S AT 
DSDYYS PTGGAPHG YCS PTSAS YG\ KALN P YQ YQYHG VNGS AGS 
Y PAKAYAD YS YASS YHQYGGAYNRVPSATNQPEKEVTE P EVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QS PAVWEPQGSSRSLSHHPHAHP PTSNQS PASS YLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6050 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKIAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6051 




1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6052 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPA 
HDSGHGDDESPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLG FLNVTNYCHLAHE LRLSCMER KKVQIRS MD PS ALAS D 
RFNLIIjADTNSDRLFTVNDVTVGGSKYGIINLQSLKTPTLKVFM 

HENLYFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS
TGLSRRVLLTNVVTGHRQSFGTNSDVLAQQFALMAPLLFNGCRS
GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD
MAGKIKLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGILVAVG
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG
APGLLMAVGQDLYCYSYS 


6054 


1 


1054 


PPIARIX3EFGTSRRHMAAPSGVHLLVRRGSHRIFSSPLNHIYLH 
KQSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 
VTTG I VPDWGDS I E VKNEDQ IQGLHQACQLARHVLLLAGKSLKV 
DMTTEE I DALVHRE 1 1 SHNAYPS PLGYGGFP KS VCTSVNNVLCH 
G I PDSRPLQDGDI INI DVTVYYNG YHGDTSETFLVGNVDECGKK 
L VEVARRCRDEAIAACRAGAP FS VIGNT I SH ITHQNGFQ VCPHF 
VGHGIGSYFHGHPEIWHHANDSDLPMEEGMAFTIEPIITEGSPE 
FKVLEDAWTVVSI^/TSKVSAQFEIHTVLITSRGAQILTKLPHEA 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)


6055 


421 


2364 


P PYFLLS FLAWWLYGQSDRTBTDI SQSAGPPPGTLQCSALHHDP 
GCANCSRFCRDCSPPACQCHTHVFPGNALNGVQPPELSRTLALI 
SSRE P PRKKKKSQTETGKERERTS FliTQGGKRFELQHGLAG I CM 
TLLITGDS I VSAEAVWDHVTMANRELAFKAGDVIKVLDASNKDW 
W WGQ I DDE EGW F PAS FVRLWVNHEDEVEEG P S D VQNGHLD PNS D 
CLCLGRPLQNRDQMRANVINE IMSTERHYI KHLKDICEGYLKQC 
RKRRDMFSDEQLKVI FGNI EDI YRFQMGFVRDLEKQYNNDDPHL 
SEIGPCFLEHQDGFWIYSEYCNNHLDACMELSKI24KDSRYQHFF 
EACR L LQQM I D I A\ I DG FLLTP VQ K I CKYPLQ LAELL KYTAQDH 
SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 
GED I LDRS S E L I YTGEMAW I YQ P \ YGRNQQRVF FLFDHQMVIjCK 
KDL I RRDI L Y YKGRIDMDKYEWD I EDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQ KRQ AAMTVRKV PKQ KG VNS ARS VP PS Y P P P QDPLNHGQ YLVP 
\DGIAQSQVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056 


43 


3356 


SGGRGPVRVRSEQLS PSAEQVSQ ISO. I SLGRRPLSSLP PPPSRA 
LAPTRAPDTALT I ME VAE VE S PLNP S CKI MT FRP SME E FREFNK 
YLA YME SKG AHRAGLAKV I P PKE WKPRQC YDD I DNLL I PAP I QQ 
MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNLTFVAP I YGAD INGS I YDEG VDEWNI ARLNTVLDWE 
EECG I S I EGVNTP YLYFGMWKTTFAWHTEDMDLYS I NYLHFGEP 
KSWYAIPPEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLISPSV 
LKKYG I PFDK I TQEAGE FMI TFP YG YHAGFNHGFNCAES TNFAT 
VRW I DYGKVAKLCTCRKDMVKISMD I FVRKFQPDR YQLWKQGKD 
I YT I DHTKPT PASTPEVKAWLQRRRKVRKAS RS FQ CARS TS KRP 
KADEE E E VS DE VDG AEVPNPD S VTDDLKVSE KS EAA VKLRNTEA 
S S E E E S S ASRMQ VEQNLS DH I KLS GNS CLS TS VTED I KTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEIPAVPSGERNSFKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPEVLSI 
EEEVEETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMPYHKPDSSNEENDARWETKLDEWrSEGKTKPLIPEMCF 
IYSEENIEYSPPNAFLEEDGTSLLISCAKCCVRVHASCYGIPSH 
EICDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAHVMCA 
VAVPEVRFTNVPERTQIDVGRI PLQRLKLKCI FCRHRVKRVSGA 
C I QCS YGRCPAS FHVTCAHAAG VL\ ME PDDWP YWN I TCFRHKV 
NPNVKS KACEKVIS VGQTV ITKHRNTR Y YSCRVMAVTS QTFYE V 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDEELPKRVKARF 
VS AGRCHLGTCQVNS LS S PHVS QAQOETYLGFW INS KKSQCN I F 
LSGTY 


6057


1 


853 


FVARLKEQEGEGGLGPRKEKGRARGRERRRKMQLTRCCFVFLVQ 
GS LYLVI CGQDDGP PGS EDPERDDHEGO PRPRVPRKRGHIS PKS 
RPMANSTLLGLLAPPGEAWGILGQPPNRPNHSPPPSAKVKKIFG 
WGDFYSN I KTVALNLLVTGKI VDHGNGTFS VHFQHNATGQGNI S 
I S LVP PS KAVEFHQEQQ I F I EAXASKI FNC \ RME WE KVE \RGRR 
TSLFTHDPAKI CSRDHAQSSATWSCSQP FKWCVYIAFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


HPLPSASLGLP S VS LG VS LCVRS ALLEAWPMLP KRRRARVGS P 
SGDAASSTP PS TRFPGVAI YLVE PRMGRSRRAFLTGLARSKGFR 

ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACQR 
PTPLTHHNTGLSEAIiEILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQGLPHFGEHSSRWQELLEHQVCEEVERVRRSE/ 
RLFTQ I FGVG VKTADRWYREGLRTLDDLREQ PQKLTQQQKAGE P 
S R E AG PWAS LN CTLDP S AS TP 


6059 


2 


3650 


QQDFESLADLTDHRAH RC PGDGDDDPQLS VA7AS S PSS KDVASPT 
QMIGDGCDLGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 
SDKLP FKCTYCSRLFKHKRS RDRH I KLHTGDKKYHCHECE AAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
EHLAKSEKEAKKDDFMCDYCEDTFSQTEELEKHVLTRHPQLSEK 
ADLQCIHCPEVFVDENTLLAH I HQAHANQKHKCPMCPE \QFSS V 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
BRGSTPDS TLKPLRGQKKMRDDGQGWTKWYS CPYCSKRDPNS L 
AVLE IHLKT IHADKPQQSHTCQI CLDSMPTLYNIjNEHVRKLHKN 
HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 
S DGNNAF F CNQCSMG FLTE SS LTEH I Q \ Q \ AHCS VGS AKLES P V 
VQ PTQS FME VYSCP YCTNS PI FGS I LKLTKH I KENHKNI PLAHS 
KKSKAEQSPVSSDVEVSSPKRQRLSASANSISNGEYPCNQCDLK 
FSNFESFQTHLKLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKH\LLDMPHPLCCTHCT\L 
CQEVFDS \ KVS I \QVHLAVKHSNE KKMYHCTACNWDFRKEADLQ 
VHVKHS HLGN P AKAHKC I FCGETFS TE VELQ CH I TTHS KKYNCK 
FCS KAFHAI I LLE KHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLQGMLLKNPE APNS HEAS E D DVD AS E PM YGCD I CG AAYTME 
VLLQNHRLRDHN IRPGEDDGSR KKAEF I KG S HXCNVCS RTFFS E 
NGLREHLQTHRGPAKH YMCPI CGERFPSLLTLTEHKVTHS KSLD 
TGT CR I CKM P LQS EEE F I EHCQMHPDLRNS LTGFRCW CMQTVT 
STLEL KIHGT FHMQKLAGS S AAS S PNGQGLQKL YKCALCLKE FR 
S KQDLVKLD VNGLP YGL CAGCMARS ANGGVGG LAP PE PADR P CA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQCIKCQMTFENEREIQIHVANHMIEEGINHECKLCNQM 
FDSPAKLLCHLIEHSFEGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


SYEIVGKNKLEVNHSQLKALCKCSLPSRLLPLGENLPLLDRGFR 
KEPRS RGSRERDNMLHLHHS CLCFRSWLPAMLAVLLSLAPSASS 
DISAS R PNILLLMADDLG I GDIGCYGNNTMRTPNIDRLAEDG VK 
LTQHI SAASLCTPSRAAFLTGRYPVRSGMVS S IGYRVLQWTGAS 
GGLPTNETTFAKI LEEKGYATGLI GKWHLGLNCES ASDHCHHP L 
HHGFDHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALV 
ALTLVAGKLTHL I PVSWMP VI WSALSAVLLLASS YFVGALIVHA 
DCFLMRimTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 
S FLHVHI PLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNS TLI YFTSDHGGS LENQLGNTQYGGWNG I YKGGKGMGGW 
EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTVVRLAGSEVP 
QDRVIDGQDLLPLLI^TAQHSDHEFLMHYCERFIjHAARWHQRDR 
GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKVVHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMERNVQQAVWEHQRTLSPVPLQ 
JjUKi/jWlWKFWLtUPLLbPr PLCWCLREDDPQ 


6061 


110 


1330 


MN IHMKRKT I KN INTFENRMLMLDGMPAVRVKTELLES EQGS PN 
VHNYPDMEAVPLLLNNVKGEP PEDSLS VDHFQTQTE PVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VSSAS S SSTVLTPGPLVASASGVGGQQFLH I IHP VP PS S PMNLQ 
SNKLSHVHR I PWVQS VPWYTAVRS PGNVNNT I WPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTALSIA 
RAVQE VH P S P VSR VRGNRMNNQK F PCS ISPFSIES TRRQRTVLN 

KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR

EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWVVPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6064 


913 


311 


NLPQSLPRPTEHSPPYSLEKMTDIjVAVWDVALSDGVHKIEFEHG 
TTSGKRWYVDGKEEIRKEWMFKLVGKETFYVGAAKTKATINID 
AISGFAYEYTLE1NGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LEKDAMDVWCNGKKLETAGEFVDDGTETHFS IGTH\ACYIKAV\ 
SSG\KRKEGI IHTLIVDNREIPEIAS 


6065 


1153 


641 


M S VR VAR VAWVRGLGAS Y RRG AS S FP VP P PG AQGVAE LLRDATG 
AEEEAPWAATERRMPGQCSVLLFPGQGSQWGMGRGLLNYPRVR 
E L YAAARR VLG YDLLE LS LHG P QE TLDRTVH CQ P AI F VAS LAAV 
E KLHHLQ P S V I ENCVAAAGFS VGE FAALVF AGAME FAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 

EDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 

VPDGIIjTRFTTNANHWFNGDGTKIAAGSSD\FLVKIVDVMDSS 

QQKTFRGHDAPVLSLS FDPKDI FLASASCDGS VRVWQISDQTCA 

ISWPLLQKCNDVINAKSICRLAWQPKSGKLLAIPVEKSVKLYRR 

ES WSHQFDLSDNFISQTIiNI VTWSPCGQYLAAGS INGL I IVWNV 

ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 

CDPSGKTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDNAVEIP 

SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 

GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 

GSTPLHLTHRFMVWNS IGI IRCYNDEQDNAIDVEFHDTS IHHAT 

HL S NT LNYT I ADLSHE AI LLAC E S TD E LAS KLH CLHFS S WDSS K 

EWIIDLPQNEDIEAICLGQGWAAAATSALLLRLFTIGGVQKEVF 

SLAG PWSMAGHGEQL F I VYHRGTGFDGDQCLGVQLLELGKKKK 

QILHGDPLPLTRKSYLAWIGFSAEGTPCYVDSEGIVRMLNRGLG 

NTWTPICNTREHCKGKSDHYWWGIHENPQQLRCIPCKGSRFPP 

TLPRPAVAILSFKLPYCQIATEKGQMEEQFWRSVIFHNHLDYLA 

KNGYEYEESTKNQATKEQQELI^KMIALSCKLEREFRCVELADL 

M TQNAVNLA IKYASRSRKLI LAQ KLS E LAVE KAAE LTATQ VE E E 

EEEEDFRKKLNAGYSNTATEWSQPRFRNQVEEDAEDSGEADDEE 

KPEIHKPGQNS FS KSTNS S DVSAKSGAVTFSSQGRVNPFKVS AS 

SKEPAMSMNSARSTNILDNMGKSSKKSTALSRTTNNEKSPIIKP 

LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 

CP PQNTENQR p KTG FQMWLEENRSN I LS DN PDFS DEAD 1 1 KE GM 

IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 

QE EKAKENLNL S KKQ KPLD FS TNQKLS AFAFKQE 


6067


858 


321 


LPWQRJ^VLI^RGKMAVTGWLESLRTAQKTALIjQDGRRICVHYLF 
PDGKEMAE E YDEKTS ELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLG PE L I KE S NANP I FMRKD TKMS FQ WR I RNLP Y P KD VYS V 
SVDQKERCI IVRTTNKKYYKKFS IPDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RG EAEAPHHGTGH PE S AGEHALE P PAPAGAS AS TP P P P APE AQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FP PS QSEERS SGFRLXP PTLIHGQAPSAGLPS QKPKEQQRS VLR 
P AVLQAPQ P KALSQT VP S SGTNG VS LP ADCTGAVPAAS PDTAAW 
RS PS EAADE VCALE E KE PQKNES SNASE E EACEKKDPATQQAFV 

FGQNLRDRVKL INES VDEADMENAGHPS ADTPTATN YFLQYIS S 

SLENSTNSADA c ;9NK'FVTi'r;r>MM<;TrT?\rT cddb-t HTrifccniMTnoMTv 
> - J j. utoj-ix/j-iooi^ i\.r v r uyW no CK V Lor Jr xvi_iJNij VooJJANRENA 

AAE SGSESSSQ EAT PE KE S LAE S AAAYTKATARKCLLE KVE VI T 
GEEAE SNVLQMQCKL F VFDKTSQS WVERGRGLLRLNDMAS TDDG 
TLQ S R LS D AG PRGS LR \ L I LNTKLWAQMQ I DKAS E K\ S I R I TAM 
DNEDQG VKVFL I SAS S KDTGQVYAALHHR I LALRSRVEQEQE AK 
M PAPE PGAAP SNEEDDS DDDDVLAPS QATAAGAGDEGDGQTTGS 
T 


6069 


583


27 


PTRPGQAGSSSAMAAQRLGKRVLSKLQSPSRARGPGGSPGGLQK 
RHARVTVKYDRRELQRRLDVEKW I DGRLEE LYRGMEADMPDE IN 
IDELLELESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 
Q\PGLRQPSPSP\DGQPSAPFQGPGARTASPLTLLALFPGPPER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGE FLH Y I FP LQ FLDS P E W / RFTE THRGRHF\ Q VTLTAE 
TDCRYVSWRRKKLYLLFAQHRYISRLFSVLIGSDIADKLYALND 
RVYIGKRYHYDIRLPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 


1654 


HEARTKGNKJU^P\VRLFSLVTRLLIAPRRGLTVRSPDEPLPV 
VR I P VALQRQLE QRQS RRRNLPR P VL VRPG P LLVS ARR PE LNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFSIERAQQEAPAVRKLS 
SKGSFADLGAWKPRVLHALQE\AAPEWQ\PTTVQSSTIPSLIjR 
GRHWCAAETG S GKTLS YLIiPLLQRLLG \H PSLD SLP I PAP RGL 
VLVPSRELAQQVRAVAQPLGRSLGLLVRDLEGGHGMRRIRLQLS 
RQPS ADVLVATPGALWKALKSRL I SLEQLS FLVLDEADTLLDES 
FLELVDY I LE KS H I AEG PADLED P FNP KAQLVLVGATFPEGVGQ 
LLNKVASPDAVTTITSSKLHCIMPHVKQTFLRLKGADKVAELVH 
ILKHRDRAERTGPSGTVLVFCNSSSTVNWLGYILDDHKIQHLRL 
OGQMPALMRVGI FQS FQKS SRDILLCTD I ASRGLDSTGVELWN 
YDFPPTLQDYIHRAGRVGRVGSEVPGTVISFVTHPWDVSLVQKI 
ELAARRRRSLPGLASSVXEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
VLFSRRTSKQQVYFFLFNDVLIITKKKSEESYNVNDYSLRDQLL 
VES CDNEELNS S PGKNSS TMLYSRQS S ASHLFTLTVLSNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRSFT 
AKQPDELSLQVADWLl\YQRVSDGVfYEGER\LRDGERGWFPME 
CAKE I TCQAT I DKNVERMGRLLGLETNV 


6073 


620 


860


PCRRGLARPLSRRPG/SILVHCAVGVSRSATLVLAYLMLYHHLT 
L VEAI KKVKDHRG I I PNRGFLRQLLALDRRLRQGLEA 


6074 


168 


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRASRAKVILLTGYAHSSLPAELDSGACGGSSLNSEGNSGSG 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQIRELQTIIR 
DKTASRGDFMFSADRLIRLWEEGLNQLPYKECMVTTPTGYKYE 
GVKFEKGNCGVSIMRSGEAMEQGIiRDCCRSIRIGKILIQSDEET 
QRAKVYYAKFPPD I YRRKVLLM YP ILQTG\NTVIEAVKVLI EHG 
VQPSVI ILLSLFSTPHGAKS 1 1 QEFPEITI LTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


PPTCQPQE VEHH \ YGYVP I LGNKTLPSRCHQCVI VSSSSHLLGT 
KLGPE I ERAECT I RMNDAPTTG YS AD VGNKTTYR VVAHS S VFRV 
LRRPQEFVNRTP BTVF I FWGP PS KMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNTYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFITEKRVFSSWAQLYGITFSHPSWT 


6076 


1721 


107 


HPS PTEAPRVQHLTMDCTWR I LFLVAAATGTHAQ VQL VQS GAEV 
KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 
GET I YAQKFQGR VTMTEDTS TDTAYMELS S LRS EDTAVY YCATD 
HGDYAFD I WGQGTMVTVSSAPTKAPDVFP I ISGCRHPKDNSPW 
LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 
PT AQ PQAEG S LAKATTAPATTRNTGRGGE E KKKE KE KEEQ EERE 
TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 
AHLTWEVAGKVPTGG VE EGLLERHSNGSQ S QHS RLT L P RS LWNA 
GTS VTCTLNHP S LPPQRLMALRE PAAQAP VKLS LNLLASSDPPE 
A\ASWLLCEVSGFSPPNILLMWLEDHGEVNTSGFAPARPLPKP\ 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSH£DSRTLLNASRSL 
EVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPIFWIGLISSVCCVFAQTDENRCLKANAKSCGECIQ 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GS KD I KKNKNVTNRS KGTAEKLKPED ITQ IQ PQQLVLRLRSGEP 
QTFTLKFKRAED YP I DLY YLM\ DLS YSMKDDLENVKS LGTDLMN 
EMRRITSDFRIGFGS FVEKTVMPYI STTPAKLRNPCTSEQNCTS 
PFSYKNVLSLTNKGEVFNELVGKQRISGNLDSPEGGFDAIMQVA 
VCGS L IGWRNVTRLLVFS TDAGFHFAGDGKLGG I VLPNDGQCHL 
ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKE 
L KNL I P KS AVGTLS ANS SNVIQLII DAYNS LSSEVI LENGKLS E 
G VT I S YQS Y\ CKNG VNGTGBNGRKCSNI S IGDEVQFE I S ITSNK 
CPKKDSDSFKIRPLGFTEEVEVILQYICECECQSEGIPESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEVNSEDIGCFTARKENQ 
FQKS ASNHGRVPSAGQCVCRKRDNTNE I YSGKFCECDNFNCDRS 
NGLICGGNGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQIC 
NG RG I CE CG VC KCTDP KFQGQT CE MCQTCLGVCAE H KE CVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHWENPECPTGPDI IP I VAGWAG 
I VL I GLALLL I WKLLM 1 1 HDRRE FAKFEKEKMNAKWDTGENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGILE 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPLQVNYSLKGIVEKY 
NKIKISPKMPVCKGH\LGQPLNIF\CL\TDMQLDL/CGIC\ATR 
GEHTKHVFCSIEDAYAQERDAFESLFQSFETWRRGDALSRLDTIj 
ETS KRKS LQLLTKDSDKVKEFFEKLQHTLDQKKNE 1 LSDFETMK 
LAVMQAYDPE INKLNTI LQEQRMAFNIAEAFKDVSE P I VFLQQM 
QEFREKIKVIKETPLPPSNLPASPI^MKNFDTSQWEDIKLVDVDK 
LSL PQDTGTFI 5 KI P WS FYKLFLL I LLLGLVI VFGPTMFLE WSL 
FDDLATWKGCLSNFSS YLTKTADFI EQSVFYWEQVTDGFFI FNE 
RFKNFTLWLNNVAEFVCKYKLL 


6079


1586 


141 


ATARD LGCARR I DR WM ES TPS RGLNRVHLQCRNLQE F LGGLS P 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
L WVKKE FS KAQEE STG LLSGLR I WHTQLL PGG LQGL I LNP I FRQ 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLW YFMLQ YLQTAQSRGMDLVE ILS FLFQLS FSTLGKD 
YS VEGMSDSLLNFLQHLRE FGLVFQRKRKS RR YYPT/RALAI NL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP\NMVV\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VMLKQTP VLPPTITDQ I RLWELERDRLRFTEGVLYNQFLSQVDF 
ELL \ LAHAP KLGVLVFE /NTPAKRLMWTPAGHSDVKRFWKRQK 
HSS 


6080 


1 


1199 


IET I DHVGEFAMAAQAAGVS RQRAATQGLG SNQNALKYLG QD FK 
TLRQQCLDSGVLFKDPE FPACPSALG YKDLGPGS PQTQG 1 1 WKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTLNEELL 
YRWPRDQDFQENYAG I FHFQPLCP PS P \ FWQYGE WVEWI DDR 
LPTKNGQLLFLHSEQGNE FWSALLE KAYAKLNGCYEALAGGSTV 
EGFEDFTGGISEFYDLKKPPANLYQI IRKALCAGSLLGCS IDVY 
SAAEAEAITSQKLVKSHAYSVTGVEEVNFQGHPEKLIRLRNPWG 
EVEWSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 
QFSRLE I CNLS PDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081 


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCIFVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGR FHLLGDP S RNNCS LS I RDARRRDNGS YFFWVARG RTKFS Y 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTIIPRPQDHGTNLICQVTFP 
G AGVTT E RT I QLS VS W KS GTVE EWVLAVG WAVKI LLLCLCL I 
I LS FHKKKAVRAVEVE ENVYAVMG 


6082 


283 


1288 


EARSPGPTQTRTAPGLAAPGLAQPAALRLLLSRPPSAAMDGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDE 
LPEPLLA/LRVLAALPRHE \ LVQACR\LVCLRWKELVDGAPLWL 
LKCOQEGLVPEGGVEEERDHWQQFYFLSKRRRNLLRNPCGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQVIDLQAEGYWEELLDTTQPAIWKDWYSGRSDAGCLYELTV 

FCLLSEHENVLAEFSSGOVAVPQDSDGGGWMEISHTFTDYGPGVR
FWFEHGGQDSVYWKGWFGARVTNSSVWVE?


6083 


1865 


303 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVOEETOLDT^GDSVKTIAKIiWDQKMFAFTMMTC TPPYT qKn A 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6084 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE

pLTr/NTTCTVlT PIT OHPlCiriTT'T & VT UnOVMl?ai7TMIJVTDPVTOvr\7W» 

u V KJtUCi lUltULbbUc VR.1 J. AJxLiW Ufa NMr A1S 1 NH&J. hhllbKyAKA 

SEVMGPVEAAPEYRVIVDANNLTVEIEKELNIIHKFIRDKYSKR
FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE
SRMSFIAPNLSIIIGASTAAKIMGVAGGLTNLSKMPACNIMLLG
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST


6085 


2 



1456 


SGPRSFQGNRAVGRISLGGKRNPEVTLLPGVSSERVRRWRRARV 

ljV>U<V IviAalNiJWKjror'.A iy VrK/ V-rAU V lJjr'vjK\jir r LiKnG&ELiVM 
DEEAYVL YHRAQTGAPCLS FD I VRDHLGDNRTELPLTL YLCAGT 
QAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEBDEEER 
KPQLEIiAWPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFALR 
RLLQ WEEPQALAAFLRDEQAQMKP I FS FAGHMGEGFALDW S PR 
VTG RLLTGDCQ KN I HLWT P TDGGS WHVDQRPFVGHTRS VE D LQW 
S P TENTVFAS CS ADAS I R I WD I RAAPS XACML TTATAHDGD VNV 
I SWSRREPFLLSGGDDGALKI WDLRQ FKSGSP VATFKQHVAPVT 
SVEWHPQDSGVFAASGADHQITQWDLG/IVERDPEAGDVEADPG 

ISV 


6086 


2419 


1357 


G AATQHGG AMNLLP CNPHGNG LLY AG FNQDHGC FACGMENG FR V 
YNTDPLKE KEKQE FLEGG VGHVEMLFRCNYLALVGGGKKP KYPP 
N KVM I WDD LKKKT V I EI E F STE VKAVKLRR \DK I VWLDSM I KV 
FTFTHNP \ HQLHV FE \TC YNP KGLCVLCPNSNNS LLAFPGTHTG 
HVQLVD1ASTEKPPVDIPAHEGVLSCIALNLOGTRIATASEKGT 
LIRIFDTSSGHLIQELRRGSQAANIYCINFNQDASLICVSSDHG 
TVHIFAAEDPKRNKQSSLASAS FLPKYFSSKWSFSKFQVPSGS P 
CI CAFGTE PNAVIAI CADGS YYKFLFNPKGECIRDVYAQFLEMT 
DDKL 


6087 


476 


1877 


LVAVI YLVS I WAVPLCVWELQKLEVG I HTKAWFIAGI FLLLTI 
PISLWVILQHLVHYTQPELQKPIIRILWMVPIYSLDSWIALKYP 
GI AI YVDTCRECYEAYVI YNFMGFLTNYLTNRYPNLVL I LEAKD 
QQKHFP PLCCCPPWAMGE VLLFRCKLGVLQ YTWRP FTTI VAL I 
CELLG I YDEGNFS FSNAWTYLVI INNMSQL FAMYCLLLFYKVLK 
E E LS P I Q P VG KFLCVKLVVFVS FWQAW I ALL VKVGV I S EKHT W 
EWQTVEAVATGLQDFIICIEMFLAAIA\HHYTFSYKPYVQEAEE 
GSCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDQ 
DQNEHTS LLS S SSQDAI S I AS SMPPS PMGHYQGFGHTVTPQTTP 
TTAKISDEILSDTIGEKKEPSDKSVDS 


6088 


1684 


689 


GASGLVRLLQQGHRCLLAP VAP KLVP PVRGVKKGFRAAFRFQKE 
LERQRLLRCP PPPVRRSEKPNWDYHAEIQAFGHRLQENFS LDLL 
KTAFVNS C YI KS EEAKRQQLG I EKEAVLLNLKSNQELS EQGTS F 
SQTCLTQFLEDEYPDMPTEGIKNLVDFLTGEEVVCHVARNLAVE 
QLTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
TQMTGKEL FEMWKI INPMGLLVEELKKRNVS APESRLTRQSG\A 
PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVALRKLYGF 
TENRRP WN YS KP KETLRAEKS I TAS 


6089 


3 


3054 


TRLGIPGSTISSRPRLC^LAAEGHFLGHSWTGSRAGAHTGAPAW 
PS RRLRDLPAGGMWRLRRAAVACEVCQS LVKHSSG I KGSLPLQK 
LHLVSRS I YHSHHPTLKLQRPQLRTSFQQFS SLTNLPLRKLKFS 
P I KYGYQPRRNFWPARLATRLLKLRYLI LGS AVGGG YTAIOCTFD 
QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVLQKDD 
KG I P FI ESLRKS LI DM YS EVLDVLS DYDAS YNTQDHLPR WWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
L FKDSSRE FDLTKEEDLAALRHE I ELRMRKNVKEGCTVS PET I S 
LNVKGPGLQRMVLVDLPG VINTVTSGMAPDTKETI FS I SKAYMQ 
DPNAI ILCIQDGSVDAERS IVTDLVSQMDPHGRRTIFVLTKVDL 
AE KNVAS PS R I QQI I EG KLFPMKALG YFAWTGKGNS SESIEAI 
R E YE E EFFQNS KLLKTS M LKAHQ VTTRNLS LAVS DCFWKMVRE S 
VEQQADS FKATRFNLETEWKNNYPRLRELDRNELFEKAKNE I LD 
EVISLSQVTPKHWEEILQQSLWERVSTHVIENIYLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDDI FDKLKEAVKEKS I KRHKWNDFAEDSLRVIQHNALEDRS I 
SDKQQWDAAI YFMEEALQARLKDTENAI ENMVGPD\WKKRWLYW 
KNRTQEG<rVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSL I KDTWHQVYRRHFLKTALNHCNLCRRGFYYYQRH 
F VD S E LECNDWL FWR I QRMLAI TANTLRQQ LTNTE VRRLE KNV 
KEVLEDFAEIX5EKKIKLLTGKRVQLAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS / AS P PLATQT WPLQHCKI PELP VQAS I L 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 
FNLLMVTTIVLGRRFIGS I VKEASQRGKVSLFRS I LLFLTRFTV 
LTATGWS LCRSL IHLFRTYS FLNLL/FPLLS VWDVHS VPAAELR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CHART P CP/ PHACCLS P S L I RS EVE FLKMDFNWRMKE VL VS SM L 
S AY YVAFVPVWFVKNTHY YDKRWSCELFLLVS ISTS VILMQHLL 
PASYCDLrjHKAAAHLGCWQKVDPALCSNVLQHPWTEECMWPQGV 
LVKHSKNVYKAVGHYNVAI PSDVSHFRFHFFFS KPLRILNI LLL 
LEG AV I VYQL YS LMS S E K WHQT I SLAL I L FSNYYAF FKLLRDRL 
VLGKAYS YSAS PQRDLDHRFS 


6091 


3279 


412 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PSDP PADHAVRPLHGARGGQ P P VPQQHVLERQVQLSQGQNWI K 
VKP P S KS GS ASASGAQRGS LEE F EDTPWS DQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVS ES VI AVKAS FPSS ALP PRTGVALGRKLGSHS VASCAPQ 
LLGDRRVD AGHTDQ P VPSG S VGG P ARP AS GPRQ AREAS LWTCR 
TNKFRKNNYKWVAASSKSPRVARP>ALSPRVAAENVCKASAGMAN 
KVEKPQL I ADPEPKPRKPATS S KPGSAPS KYKWKAS S P S ASS SS 
SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 
RRQALRGKSSPVLKKTPNKGLVQVTTHRL(2RLPPSRAHLPTKEA 
SSLHAVRTAPTSKVIKTRYRIVKKTPASPLSAPPFPLSLPSWRA 

PPT.ClTtQB t3TiVT.hTDT.P R QCnf2Y"bDD(~!.cZ DTxTUID C fcT'V'D H TPPT rr v 
u o Do t\ a u v JjIN JvuK r V JVs Kj\j<j Irooi'WWKblxu IKk,X\i\jVijI 

KVSANKLSKTSGQPSDAGSRPLLRTGRLPPAGSCSRSliASRAVQ 
RSLAI IRQARQRREKRKEYCMYYNRFGRCNRGERCPYIHDPEKV 
AVCTRF VRGTCKKTDGTCPFSHHVS KEKMPVCS YFLKG I CSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPLGAKCK3CKHTLLCPDFARRG 
ACPRGAOCQLLHRTQKRHS RRAATS PAPGPSDATARS R VS ASHG 
PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSAS PSSS KAS 
SSSSSSSSPPASLDHEAPSLQEAALAAACSNRLCKLPSFISLQS 
S PSPGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190 


AKAP PTGES S E PEAKVLHTKRLYRAWEAVHRLDLILCNKTAYQ 
EVFKPENISLRNKLRELCVKLKFLHPVDYGRKAEELLWRKVYYE 
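SEQ
ID
NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)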








VIQLIKTNKKHIHSRSTLECAYRTHLVAGIGFYQHLLLYIQSHY 
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDIjSRYQNELAGVDTELLiAERFYYQALSVAPQIGMPFNQW3TL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
LI FQMVI I CLMCVHSLERAGSKQYSAAI AFTLALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDIiSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ 
MFQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLS TLEE SWRI C C I RS FGHF I ARLQG S I LQ FNP E VG I F 
VS I AQS E QE S LLQQAQAQ FRMAQE EARRNRLMRDMAQLRLQLE V 
SQLEGSLQQ P KAQ S AMS P YLVP DTQ ALCHHL P V I RQ LAT SGR F I 
VI I PRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGNR YIRCQKE 
VGKSFERHKLKRQDADAVfTLYKILDSCKQLT\LAQGAGEEDPSG 
MVT 1 1 TGLPLDNPS LLSGPMQAALQAAAHAS VDI KNVLDF YKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALRVART/ SRWGAL \RGAVWAPGTRPSKRRACWALL 
PPVPCCLGCLAERWRLRPAAtiGLRLPGIGQRNHCSGAGKAAPRX 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP 
VLGYL 1 1 E E D FN I ALGVFALAG LTD LLDG F I AKN WAN U KbALAjb 
ALDPLADKI LI S I LYVSLTYADLI PVPLTYM I I SRD VMLI AAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFI S KVNTAVQLILVA 
AS LAAP VFN YADS I YLQI LW CFTAFTTAASAYS YYH YGRKTVQV 
IKD 


6094 


23 


1010 


P FLRCLRGDQ KAKMS E R KVLNK Y Y P P D FD P S KI P KLKLPKDRQ Y 
WRLMAPFNMRCKTCGEYIYKGKKFNARKETVQNEVYLGLPIFR 
FYIKCTRCLAEITFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQELKDLNQR 
OAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 
VKVHTVPKPGKGADLSKPPCRKAKEIRKBRKRLKLMQQNPAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQFKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCS SPLEAETPPNGPDCGYGS FHQQYWLDGKI IAVGVTDI LPN 
CVSSVYLYVDPDYSFLS3^GVYSALREIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQQKDPS 
EE AAVLQ YAS LVGQKCSERMLLFRN 


6096 


2277 


575 


QRVRAALLS S AMEDSEALG FEHMGLDPRLLQAVTDLGW SRPTL I 
QEKAI PLALEGKDLLARARTGSGKTAAYAI PMLQLLLHRKATGP 
WEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVANVSAA 
EDS VSQRAVLMEKPDVWGTPSRI LS HLQQDS LKLRDS LE LLW 
DEADLLFSFGFEEELKSLLCHLPRIYQAFLMSATFNEDVQALKE 
LILHNPVTLKLQESQLPGPDQLQQFQWCETEEDKFLLLYALLK 
LS L I RGKSLLFVNTLERS YRLRLFLEQFS I PTCVLNGE LPLRSR 
CH 1 1 SQFNQGF YDCVIATDAEVLGAP VKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LTFVLPTEQFHLGKIEELLSGENRGPILLPYQFRMEEIEGFRYR 
CRDAMRS VTKQA I REARLKE I KEELLHS EKLKTYFEDNPR \ DLQ 
LLRHDLPLHPAWKPHLGHVPDYLVP PALRGLVRPHKK\GRS CL 
PLVGRPREQS PRTHCAASSTKERNSDPQPSPPEWGPLWS 


6097 


1673 


192 


APGTMSGGKKKS S FQ I TS VTTD YEG PGS PGAS DPPTPQPPTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVD VYERDLE PHS FGGL L E G I RGAS GGAGGRS LD S RL 
ELAS LGLGAPTP P S GLS QG PTS WLR P P P TS PG P QARS F TGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 
RVEAEAGGS GART P P LSRR KAVDMRLRME LG APE E MGQ VP P LDS 
RPSSPALYFTHDASLVHKSPDPFGAVAAQKFS LAHSMLAI SGHL 
DSDDDSGSGSLVGIDNKIEQAMDLVKSHLMFAVREEVEVLKEQI 
RELAERNAALEQENGLLRALA\SPEQLGSAGPPRGVPR\LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALS NG P WS PG P L PHLL HPS LDGGG EG FRTGRQQG AP FG E E T 
QPPPSLPGTPQQ 


6098 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAXIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE 

\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVSYEQCKPVSVTPOGNDFEYTAXIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6100 


2 


713 


FVEVSGYRSRADPEPRGRDTMTYAYLFKYI I IGDTGVGKSCLLL 
Q FTDKRFQPVHDLT IGVE FGARMVN I DGKQI KLQI WDTAGQES F 
RS I TRS YYRGAAGALLVYD ITRRETFNHLTS WLEDARQHS SSNW 
VIMLIGNKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEIYRKIQQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRNSRDIGSNSGCC


6101 


1 


1399 


FRG RAWPLRE VS HWLG CRRVC S WS AS WGRL P ALS ARLS P LLAFR 
G KMVF PLS CAVQQ YAWG KMGSNS E VARLLAS S D PLAQ IAE D KP Y 
AE L WMGTHPRGDAK I L DNR I SQ KT LSQW IAE NQDS LG S KVKDT F 
NGNLP FL FKVLS VETP LS I QAH PNKELAEKLHLQAPQH YPDANH 
KPEMAI ALTPFQGLCGFRPVEE I VTFLKKVPE FQFL I GDEAATH 
L KQTMSHDS QAVAS S LQS CFS HLMKS EKKVWEQLNLL VKR I S Q 
QAAAGNNMEDIPGELLLQLHQQYPGDIGCFAIYFLNLLTLKPGE 
AMFLEANVPHAYLKGDCVECMACS DNTVRAGLTPKFI DVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G\ S VTEYKDLALDSAS ILLMVQGTVIASTPTTQTP I PLQRGGVL 
FIGANESVSLKLTEPKDLLIFRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGEIGASPAAPCCSESGDERKN 
LEEKSDINVTVLIGS KQVSEGTDNGDLPSYVSAF IEKEVGNDLK 
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES 
KQ FLNQ FLEQETHLFS A I NSHLL TAQP WMDDLGTM I S Q I EE I ER 
HLAYLKW I S Q I EELS DN I QQ YLMTNNVP EAAS TLVS MAE LD I KL 
QE S S CTHLLG FMRATVK FWHK I LKD KLTS DFEE I LAQLH W P F I A 
PPQSQTVGLSRPASAPEIYSYLETLFCQLLICLQTSHELLTEPK\ 
HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEWYLAQVLMWIGNHTEFLDEKIOPILDKVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP 
GTFASCMHILSEETCFQRWLTVERKFALQKMDSMLSSEAAWVSQ 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE 
LQKDLVDDFRIRLTQVMKEETRASLGFRYCAILNAVNYISTVLA 
DWADNVFFLQLQQAALEVFAENNTLSKLQLGQLASMESSVFDDM 
INLLERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSIjEKIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KHIKEACIVLNLNVGSALTAGKDVLPVQLQGSFPAT 


6103 


207 


2523 


ESNSTMTTYLEFIQQNEERDGVRFSWNVWPSSRLEATRMVVPVA

ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 

ACNFCYQRNQFPPSYAGISELNQPAELLPQFSSIEYWLRGPQM 

PLIFLYWDTCMEDEDLQALKESMQMSLSIiLP PTALVGL I TFGR 

MVQVHELGCEGISKSYVFRGTKDLSAKQLQEMLGLSKVPVTQAT 

RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 

PLRSSGVALS I AVGLLECTFPNTGAR IMMF I GGPATQG PGMWG 

DELKTP I RS WHD I DKDNAKYVKKGTKHFEALANRAATTGHV I D I 

YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 

KDMHGQFKMGFGGTIjEIKTPR\EIKISGAIGPCVSLNSKGPCVS 

ENEIGTGGTCQWKICX5LSPTTTLAIYFEWNQHNAPIPQGG\RG 

A\ IQFVTQY \ QHS SGQRR I RVTT I ARN\ WADAQTQ IQNI AAS FD 

QEAAAILMARLAIYRAETEEGPDVLRWLDRQLIRLCQKFGEYHK 

DDPSSFRFSETFSIiYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 

FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRILLM 

DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 

I LHSRFPMPRYIDTEHGGSQARFLLS KVNPSQTHNNMYAWGQES 

GAP I LTDDVS LQVFMDHLKKLAVSS AA 


6104 


124 


732 


KVSEYIIliSKDKI LFHALAM LVLWS PWSAARG VLRNYWERLLR 
KLPQSRPGFPSPPWGPALAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSIFWMAAPKNRRTIEWRCRRRNPQKLIKVKNNIDVCPECGH 
LKQKHVLCAYCYEKVCKETAEIRRQIGKQEGGPFKAPTIETWL 
YTGETPS EQDQGKR 1 1 ERDRKRPSWFTQN 


6105 


3 


989 


PLHGACTS LVLQRFCHRRPRPCAPARP EDMRR PAAVPLLLLLCF 
GS QRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVS I QRNGSHF 
CGGS L I AEQ WVLTAAHCFRNTS ETS L YQ VLLGARQL VQ PG PHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRP PTAPHTGR P PTANRGDPRLDLKRGCARLLTS I ESRGRPAAS 
AGLRRDRCALRRWPLRRAPLARATRRRAGS PRRCAPRPRACPQG 
WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
IPCKETCEKVDCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
S TCV\ VDQTNNAYCVTCNRI C P E P AS SEQ YLCGNDGVT YS \ SAC 
HLRKATCLLGRS IGLAYEGKCI KAXSCEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGSCNS I SEDTEEEEEDEDQDYSFPISS ILEW 


6107 


623 


168 


SRCSS PRPBPGRGRGK/ LS PSEHRKWVEVFKACDEDHKGYIjSRE 
DFKTAVVMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK 
KKEAQR YRNE VRH I FTAFDTYYRGFLTLEDFKKAFRQVAPKLPE 
RTVLEVFREV\DRDS\DGHVSF 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVP PGGCGLPS PMSASRPQS PTTPW 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 
YYFYDLLVYW I G I FCLAS ATGLYS CLAPCVRRLP \ SAS AGES A 
LLAPTIPNNSLPYFHKRPQARMLLLALFCVAVSVVWGVFRNEDQ 
WAWVLQDALG I AFCLYMLKT I RLPTFKACTLLLLVLFLYD I FFV 
FITPFLTKSGS S IMVEVATGPSDSATREKLPMVLKVPRLNSSPL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
AYGVGLLVTFVALALMQRGQPALLYLVPCTLVTSCAVALWRREL 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATSPWPAEQSPKSRTSEEMGAGAPMREPGSPAESEGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 


CRSRAGAASGGAILEGTKLRRQRVDTNKPLDPLVPSALRAAMIiY 
LED YLEM I EQL PMD LRDR FTEMREMDLQ VQNAMDQLEQR VSE FF 
MNAKKNKPEWREEQMAS I KKD YYKALEDADEKVQLANQI YDLVD 
RHLRKLDQEIiAKFKMELEADNAGITE ILERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHI PEKKFKSEALLSTLTSDA 
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI\TMAAAQAVQATAQMKEGRRTSSLKASYEAFKNNDFQLGKEF 
SMARETVG YS S S S ALMTTLTQNAS S S AADSRSGRKS KNNNKS S S 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNEPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 



ACPSAATMS DQDHS MDEMTAWKIE KGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLALLAATCSRI ES PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 
SSTNGSNGSESSKNRWSGGQYWAAAPNLQNQQVLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQIQI I PGANQQ 
I ITNRGSGGNI IAAMPNLLO^AVPLQGLANNVLSGQTQYVTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTI S SAS L VS SQAS S S S F FTNANS YS TTTTTSNMG I MNFTTS G 
S SGTNS QGQT PQR VS GLQGS DALNI QQNQTS GGS LQAGQQKEGE 
Q\NQQTQAAPKSLSRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 
I SQETLQNLQLQAVPNSGPI 1 1 RTPTVGPHGQVS WQTLQLQNLQ 
VQNPQAQT I TLAPMQG VS LGQTS S SNTTLTP IASAAS I PAGTVT 
VNAAQLS S M PG LQT I NLS ALGTSG I Q VH P I QGLPLA I ANAPGDH 
GAQLGLHGAGGDG IHDDTAGGEEGENS PDAQPQAGRRTRREACT 
CPYCKDS EGRGSGDPGKKKQHI CH I QGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLS KH I KTHQNKKGGPGVALS VGTLPLDSGAGSEGSGT 
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 


6111 


1*37 


797 


RVDPRVRGAMAPWGKRLAGVRGVLLDISGVLYDSGAGGGTAIAG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQILKERGLRPYIiLIHDGV\ASEFDQIDTS/STPNC 
WIADAGESFSYQNMNNAFQVLMELEKPVLISLGKGRYYKETSG 
LMLDVGPYMKALEYACGI KAEVGGKPSPEFFKSALQAIGVEAHQ 
AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK


6112 


77 


196 


MS S HKS FKS KRFLAKKQKPNRPI LOW I WLKTGNKI RHNWK 


6113 


1779 


567 


V3EGRS WAACGVNLQGAWGERS GVRASEAES PGKRADVS WWSRQL 
ETMVDHLANTE INSQR I AAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPRI FFLFNDILVYGS I VLNKRKYRSQHI I PLEEVT 
LELLP ETLQAKNR WM I KTAKKS FWS AASATERQEW ISHI EEC V 
RRQ LRATGRPA\ S TEHAAPW I PDKATD I CMR CTQTRFSALTRRH 
HCRKCRVWCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAEEQGAGVPRAASHLARP I CGRPVEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 
PGLRD P I P WWQVQRWGVAL SG LPV P FCWTLC P YGFTAGNAFP FR 
KPQNTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHL P PHADRRALRLP VAAP ARG PG PGHP AGP AG PR PARTP PAS P 
HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTREDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6115 


324 


71 


DVCGRVCAHPHLYTH IHMH I CAHAC \ I HTHAQLC/ I TASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


TG VM P PGRWHAA / 1 SSSGPVFEGARA\LQTVKKEEEDESYTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQILELLVLEQFLSILPGELRVWVQLHNPESGEEVL 
WPCWRS CRGTLMGHPGGTRALP \ EPRCALDGYRS \LRSAQI WS L 
ASPLRSSSALGDHLEPPYE I EARDFIiAGQSDTPAAQMPALFPRE 
GC PGDQ VTPTRSLTAQLQ E TMT FKD VE VT FS QDE WG WLDS AQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 


VGVPS PAPPCSWBVGPGGGWTPGI LKEGOGGRRTPLLLLATRTR 
GLLSL F P PAAMHP AAF P LPVWAAVL WGAAP TRGLI RATS DHNA 
SMDFADLPALFGATLS QEGLQGFLVEAHPDNACS P I APP PPAPV 
NGSVF I ALLRRFDCN FDLKVLNAQ KAG YGAAWHNVNSNE LLNM 
VWNSEBIQOQI WI PS VFI GERS S E YLRALFVYEKGAR VLLVPDN 
TFPLGYYLI P FTG I VG LLVLAMGAVM IARC IQHRKRLQRNRLTK 
\EQLKQ I \ PTHDYQKGDQYDVCAI CLDE YEDGDKLRVLPCAHAY 
HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDE 
GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118 


1044 


247 


ST I S CRACTSGATPGAQSHRS ARGHAAGGKETAALGMERGKVKK 
KEKEKETQKEKIGEKGREEKVKRKEVEQKIKQEKQEKQERRKGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 
DSQMEFLEIGGSKPFRSYWEMYLSN/ADSLARSFSVGFKQDSQP 
ITWKAKKYLHQLIAANPVLPLWFANKQDLEAAYHITDIHEALA 
II 


6119


1217 


462 


DPRFVTENTTKAPAQERTTQPRSSREGTLRSTMEYLSALNPSDL 
LRSVSNISSEFGRRVWTSAPPPQRPFRVCDHKRTIRKGLTAATR 
QELLAKALETLLLNGVLTLVLEEDGTAVDSEDFFQLLEDDTCLM 
VLQSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
D L FG S LNVKATFYGLYS MS CD FQGL \G PKKVLRE LLRWT S TLLQ 
GLGHMLLG I SSTLRHAVEGAEQWQQKGRLHSY 


6120 


785 


179 


LE RAGGGGLS S RAL VG SGACLS LVARANG KG L PRGRKE FVEAVR 
VRYVAFRYRTPRAVCLRLWSCRREVIMSGRGKQGGKVRAKAKSR 
S S RAG LQ F P VGRVHRLLRKGNYAER VGAGAP VYLAAVLE YLTAE 
ILELAGNAARDNKKTRI IPRHLQLAIRNDEELNKLLGKVTIAQG 
G \ VLPN I Q AVLL P KKTB £>Q KUb(j AND f 


6121 


1612 


107 


FVRAQARGS RQP VRRP LLGAGSRLRCRS CGRMEPLKVEKFATAN 
RGNGLRAVTPLRPG ELL FRSD PLAYTVCKGS RGWCDRCLLGKE 
KLMRCSQCR VAKYCS AKCQKKAW PDHKRECKCLKS CKPRYPPDS 
VRLLGRWFKLMDGAPS ESE KL YS F YDLESN INKLTEDKKEGLR 
QL VMTFQH FMREE I QDAS QL P P AF DL FE AFAKVI CNS FT I CKAE 
MQEVGVGL YPS I SLLNHS CDPNCS I VFNG PHLLLRAVRD IEVGE 
ELTICYLDMLMTSEERRKQLRDQYCFECD\CFRCQTQDKDADML 
TGDEQVWKEVQESLKKI EELKAHWKWEQVLAMCQAI ISSNSERL 
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMBPYRIFFPG 
SHP VRGVQ VM KVGKLQ LHQGM F PQ AMKNLRLAFD IMRVTHGREH 
SLIEDLILLLE/AMRRQHQSILRERSQREIRRVSLLNALLRSHT 
LCFVSCVNLSYWKFCSVFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG 
NTG TQTNG LD FQ KQ P V P VGGAI S TAQAQAFLGHLHQVQLAGTSL 
QAAAQSLNVQS KSNEE S GDS QQPSQP SQQPS VQAAIPQTQLMLA 
GGQ I TGLTLTPAQQQLLLQQAQAQAQLLAAAVQQHSASQQHSAA 
GATISASAATPMTQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFV 
LVH PTTNLQ PA\ Q F 1 1 S QT PQGQQGLLQA \ QNLLTQLPRQS QAN 
LLQSQPRI\TLTSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEE P \ S DLE E LEQ FAKT F KQRR I KLG FT\QGDAG LAMVKL YGND 
FS PTT I FRFEALNLS FKNMCKLKPLLEKWLNDAENLSSDS SLSS 
PS ALNS PG I EG LS RRR KKRTS I E A\ N I R VALE KS FLEN\ QKPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 
I KAI FPS PTSLVATTPS LVTS S AATTLTVSPVLPLTSAAVTNLS 
VTGTS DTTSNNTATVI STAP PASS AVTS PSLS PS PSASASTSEA 
SSASETSTTQTTSTPLSSPLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS P ALMSNS T LAT I QALAS GGS L P I TSLDATGNL VFANAGGA 
PNI VTAPLFLNPQNLS LLTSN P VSLVS AAAASAGNSAPVASLHA 
TSTSAESIQNSL FTVASAS GAASTTTTAS KAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL 
HLQPLEMKRVG WFTPADYGKVTSLI LI RNNLTV IDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDS KQI LS IT 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEVLDCHQFSLDPNT 
SRDIS I VFTPDFTS SWVIRDLSLVTAADLEFRFTLNVTLPHHLL 
PLCADWPG PS WEES FWRLTV F FVS LS LLG VIL I AFQQAQ YI LM 
EFMKTRQRQNAS SS SQQNNGPMDVIS PHS YKSNCKNFLDTYGPS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 
HKT^TAAA^ ^T^TTTFFrCOT^ PLGSSLPAAKEDICTDAMRENWI 
SLRYASGINVNLQKNLTLPKNLLNKEENTLKNTIVFSNPSSECS 
MKEGIQTCM FPKETD I KTSENTAE FKERELCPLKTS KKLPENHL 
PRNSPQYHQPDLPEISRKNNGNNQQVTVKNEVDHCENLKKVDTK 
PSSEKKIHKTSREDMFSEKQDIPFVEQEDPYRKKKLQEKREGNL 
QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDS VS QNDFPS EAP I S LNLSHN I CNPMTGNSLPQY 
AE PS CPS LP AGPTGVE EDKGLYS PGDLWPTP PVCVTSSLNCTLE 
NGVPCVI QES APVHNS F I DWS ATCEGQFSSAYCPLELNDYNAFP 
E ENMNYANGFPCPADVQTDF I DHNSQSTWNTP P\NMPAS \ WGKA 
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 
PTTEHSD/THMENQA\VVCKEYYPGF\NPFRAYMNLDIWTTT\A 
NRNANFPbSRDSSYCGNV 


6124 


1573 


236 


SDEALRLAGERGMGRVQLFE I S LSHGRWYSPGE PLAGTVRVRL 
GAPLP FRAI RVTCI GSCGVSNKANDTAWWEEGYFNS SLS LADK 
G S LPAGEHS FPFQFLLPATAPTS FEGPFGKIVHQVRAAIHTPRF 
SKDHKCSLVFYILSPLNIiNS I PDIEQPNVASATKKFS YKLVKTG 
S WLTAS TDLRG YWGQALQLHADVENQSGKDTS P WASLLQKV 

S YKAKRW I HDVRT I AEVEGAG VK A W R T? A DWRPD T T.VD B T DD CAT 

PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125 


1 


904 


KTCPKLTCAFTVSVPDSCCRVCRGDGELSWEHSDGDIFRQPANR 

IVQIVINNKHKHGQVCVSNGKTYSHGESWHPNLRAFGIVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG / KKAKEEIi 
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389 


RLLSEAPCPRSRRRFQMNPEWGQAFVHVAVAGGLCAVAVFTGIF 
DSVSVOVGYEHYARAPVAGLPAFTiAMPTTO^TiVNTMAVTT T pt cut 

HRGGAMGLGPRYTiKDVFAAMALLYGPVQWLRLWTQWRRAAVLDQ 
WLTLP I FAW P VAWCLYLDRGWRP \ WLFLS LECVS LAS YGLALLH 
P QG FEVALGAHWP AVGQALRT \HRHYG / SAT P S AT YLALG VLS 
CLGFWLKLCDHQLARWRLFQCLTGHFWSKVCDVLQFHFAFLFL 
THFNTHPRFHPSGGKTR 


6127 


1335 


463 


VLPRRCLVFWNTMDSSREPTLGRLDAAGFWQVWQRFDADEKGY 
I EE KE LD AF FLHMLM KLGTDDTVMKANLHKVKQQ FMTTQDAS KD 
GRI RMKELAGMFLSEDENFLLLFRRENPLDSS VEFMQ I WRKYDA 
DSSGFISAAELRNFLRDLFLHHKKArSEAKLEEYTGTMMKIFDR 
NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVSKTGALEGP\EVDGFVKDMMELVQPSISGVDLDKFREILL 
RHCDVNKDGKIQKSELALCLGLKINP 


6128


2511 


843 


TCRMSRROLERWVWSSOOVOAJ?ftRNVT?ftPPT.rcVT &m/*2T pmccvti '" 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR 
GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 
MSSQWGI EPLYI KAEPASPDSPKGSSETETEP PVALAPG\ PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
G YHYGVAS CEACKAFFKRTI QGS IE YSC PASNE CE I TKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
FPAGPLAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 
GHLPAVATLCDLFDREI WTISWAKS I PGFSSLSLSDQMSVLQS 
VWMEVLVLGVAQRSLTLQDELAFAEYLVLDEEGARPAGLGELG\ 
AALI^LVRRLQALRLER^E YVLLKALALANSDSVHI EDEPRLWS 
SCEKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 
K VLAH F YG VKLEG KV PMHKL FLE MLEAMMD 


6129 


1764 


771 


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKTKCMM 
KSSDCVI KHAG V Y S TGLAMVGA I CDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQVLEAETFKCVSCNRLGQHSCLRCKACFCDDHTRSKVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 


6130 


3 


577 


GRGGTMRE YKVWLGSG \GVGKS ALTV\QFVTCTF I E KY DPT I E 
DFYRKEIEV\DSSPSVAGISWTQQGTEQF\ASMRDLYIKKGQGC 
I LVYS LVNQQS FQ \ D I KFMRDQ 1 1 RVKVS E KVP VI \ LVGN \ S VD 
LESEREVSSSEGRALAEEWGCPFMETSAXSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRS LS AMRLLPLAPGRLRRGS PRHLPS CS PALLLLVLGGCLGVF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRAS ILTGKYPHNHHWNNTLEGNCSS KSWQKI 
QEPNTFPAILRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW 
S YWYALEKNSKYYNYTL.S INGKARKHGENYS VDYLTDVLANVSL 
DFLD YKSNFE P FFMMTATP \APHS P WTAAPQ YQKAFQNVFAPRN 
KNFNIHGTNKHWL I RQ AKTPMTNS S IQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYI FYTSDNGYHTGQFSLP IDKRQLY 
EFD I KVPLLVRGPG I KPNQTS KKLVANIDLGPTILDI AGYDLNK 
TQMDGMS L L P I LRG AS NLTWRS D VL VE YQGEGRNVTD P TCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV 
EVYNLTADPDQITNIAKTIDPELLGKMNYRLMMIjQSCSGPTCRT 
PGVFDPGYRFDPRLM FSNRGS VRTRR FSKHLL 


6132 


96 


1241 


AAGLLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPLFSCVESGW
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL
FQRLAALKLEAEDIALTATSQKHKLTWLEAVNRSNCSWRSGRP
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI
ES TKSGLKS EKLVEQLTE YS TDKDE P PKDVFDELFKLAPE KVNA 
VKEAIVNFVNQKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEF YLTPNS PAEMLHNVTLALELL/ IGRGPAQLPC /LALK/ 
TI VNKDAKS TLRVLYGLFC KHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVS VSQQPVSAP VP IAAHAS VAGHLSTSTTVSS SGAQNS DSTK 
KTLVTLIANIWAGNPLVC^GGQPLILTQNPAPGIjGTMVTQPVLR 
P VQVMQNANHVT S S P VAS Q P I F ITTQGFP VRNVR P VQNAMNQ VG 
I VLNVQQGQTVRP ITLVPAPGTQFVKPTVGVPQVFSQMTP vrpg 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQP TSLGQLAVQ.S PGQSNQTTftPKLAPS FP S P PAVS IAS FVT 
VKRPG VTGENS NEVAKLVNTLNT I PS LGQS PG P VWSNNS S AH \ 
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/ THPS STP I PALS P P Y/TKVPE PNENVGDAVQTKL I MLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRliKNNIRFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KI CEWAFE SEPLFLQHMKDTHKPGEMP YVCQVCQ YRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT 
S VGDAMAKHLVFNPSHRSSS I LPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
S TATPP PTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPO 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKD VAENAGLF I DFVQRQ I HNQDL PLSM I VA I DE I S LFL 
DTEVLSSDDRKENALQTVGTGEPWCDVVLAILADGTVLPTLVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVKQKHTACQ 
RS KGM L VtyDCHRTHLS E E VLAMLS AS S TL P A WPAGCS S KI QPL 
DVC I KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEEL1AS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6134 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIBDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLI ANNNAGNP LVQQGGQPL I LTQNPAPGLGTMVTQP VLR 
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVRP ITLVPAPGTQFVKP TVGVPQVFSQMTP VRPG 
S TM PVR PTTNT FTTV I P ATLT IRS TVP QS Q£ Q Q T KS TP STS TTP 
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPS PPAVS IAS FVT 
VKR PG VTGENSNE VAKL VHTLNT IPS LGQS PGP VWSNN S S AH\ 
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMC YCCPEHVE YQKKGKS LDSE P S VPSAAKP PS P EKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGEVDGHTI CQHC YRQ FSTPFQLQCHLENVHS P YESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRM IHEDTRHLLCP YCLKVFTCNGNAFQQHYMRHQKR \NVYH\ 
CNKCRVQFL FAKDKI EHKLQHHKT FRKP KQLEGLKPGTKVT I RA 
SRGQPRTVPVS SNDTPPSALQEAAPLTS SMDPLP VFLYPP VQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHM I NNHVPRKS P KYLAL F KNS VSG I KLACTS CTF VT 
S VGDAMAKHLVFNPSHRS S S I LPRGLTW IAHSRHGQTRDRVHDR 

STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 
LiASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQRBQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTE VLS S DDRKENALQTVGTGE P WCDWLAI LADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYSDDE IMELWSTRVWQKHTACQ 
R S KGM LVMDCHRTHL SEE VLAMLS AS S TL PA WPAGCS SKI Q PL 
DVC I KRTVKNFLHKKW KEQ AREMADTACDS DVLLQLVL VWLGE V 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSS PEET I E PESLHQLFEGESETES 
FYGFEEADLDLMEI 


6135 


2 


4256 


FVHGSMADTDLFMECEEEELE P WQKI SDVI EDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQ PL I LTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTS S P VASQP I F I TTQGFP VRNVR P VQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FP S PPAVS IAS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHM CYCC P EMVE YQ KKG KS LDS E P S VP S AAKP PS PE KTAPVAS 
/ THPSSTP I PALS PP Y / TKVPEPNENVGDAVQT KL IMLVDDFYY 
GRDGGKVAQLTNFPKVATS FRCPHCTKRLKNN IRFMNHMKHHVE 
LDQQNGEVDGHTICQHC YRQ FSTP FQLQCKLENVHSP YESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
I QKRAVRKMSVMGRQTCLECS FE I PDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYIJ^FIG^SVSGIKLACTSCTFVT 
S VGD AMAKHLVFNPS HRS S S I L P RGLT W I AHS RHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQBPE 
LASGGGGSGGVGKKEQLSVKKLRVVLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASO^ENLEGKYLSFEAEEKIiAEWVLTQREQQLP 
WEETLFQKATKIGRSLEGGFRISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLSSDDRKENALQTVGTGEPWCDVVLAIIADGTVLPTIjVFY 
RGQMDQPANM PDSILLEAKESG YS DDE IMELWS TRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVC I KRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVI GDCP E L VQRS FLVAS VLP G P DGN I NS PTRNADMQ EEL IAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS
SLFRGEHHSRGGTGRLASLFSSLEPQIQPVYVPVPKNESALASA
DLEEEIHQKQGOKRKNSQPGVKVADRKILDDTEDTWSQRKKIQ
INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR
S L I P AEGT LS KKIIAA IKRKIHPDQKNI NAYWF KEES AATQALK

RNGAQIADGFRIRVDLASETSSRDKRSVFVGNLPYKVEESAIEK 
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFENTDSVHLALKLNN 
S E LMGRKLR VMRS VNKE KFKQQNS N P RLKNVS KP KQGLN FTS KT 
AEGH P KSL F I GE KAVLLKTKKKGQKKS GR P KKQR KQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNML I VAMCLA\ LLGLPGKAQELQGHVS \ 1 1 LAGEQLGDLAKK 

YLWOG\LFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALES
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ
LQHAGLREAGG I FYFSVP P FAYED I ARNINSSCRPG PGAWLRW
LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQIL
PFRDQNRKALDGLWNRHHVERVEI I MKETVDAEGRTS FYEE YGV

IRDVLQNHLTEVLTIjVAMELPHNVSSAEAVLRHKLQVFQALRGL 
QRGSAWGQYQS YS EQVRRELQKPDS FHSLTPTFAGVLVH IDNL 
RWEGVP F I LM SGKALDERVGYARI LF KNQACC VQS E KHWAAAQS 
QCXPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 
IiRLFGS PLSD YYAYS PVRERDAHSVLLSHI FHGRKNFFITTENL 
LAS WNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDIEATAVRAVRRFGQFHLALSGGSSPVALFQQLATAHYGFPW 
AHTHLWLVDERCVPLSDPESNFG^IiQAHLLQHVRIPYYNIHNAM 
PVHLQQRL(^EEDQGAHIYAREISALGANSSFDLVLLGMGADGH 
TASLFPQSPTGLDGEQLVVLTTSPSQPHRRMSLSLPLINRAXKV 
AVLVMGRMKRE I TTLVSRVGHE PKKWP I SGVLPHSGQLVWYMDY 
DAFIiG 


6138 


4587


934 


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
TDTSHLLSAVKGQERFSLYQTRSLIHELKNKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRI SQLQDSWKDMEPQLAEM I KQ 
FQSTVETWDQCEKKIKELKSRLQVLKAQSEDPLPELHEDLHNEK 
ELI KELEQ S LAS WTQNL KELQTMKADLTRHVL VEDVMVLKEQ I E 
HLHRQWEDLCLRVAIRKQEIEDRLNTWVVFNEKNKELCAWLVQM 
ENKVLQTADI S IEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNMSNLRTWLARIESELSKPVVYDVCDDQEIQKRLAEQQD 
LQRDIEQHSAGVESVFNICDVLLHDSDACANETECDSIQQTTRS 
LDRRWRN I CAMS MERRMKI EETWRLWQKFLDDYSRFEDWLKSAE 
RTAACPNS SEVLYTS AKE ELKR FEAFQRQ I HERLTQLEL I NKQY 
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 
NQ RE E F EGTRE S I L VWLTEMDLQLTNVEH FSES D ADD KMRQLNG 
FQQE I TLNTNKI DQLI VFGEQL I QKS EP \LDAVL I EDELEELHR 
YCQEVFGRVSRFHRRLTSCT PGLEDE KEAS ENETDMEDPRE I QT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVSVDS\IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAG I TEQQSGAFDR WEM I QAQEL \HNK 
LK I KQNLQQLNSD I SAITTWLKKTEAELEMLKMAKPPSDI QE I E 
LRVIOUjQEILKAFDTYKALWSVNVSSKEFLQTESPESTELQSR 
LRQLS LL WEAAQGAVDS WRGGLRQ S LMQ CQD FHQ LS QNLLLWLA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQ E I SNS L L I KGHGEDC I EAE E KVHVI \ E KKL KQLRE Q VSQD LM 
ALQGTQN PAS PLP S FDEVDS G DQ P PATS V P AP RAKQ FRAVRTT E 
GEEETESRVPGSTRPQRSFLSRWRAALPLQLLLLIiLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TFR R PW E AS <> W KTT . /LAG N I GGAAS VI VGHP LDTVKTRLO AG VG 
YGNTLSCIRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN 
TQR FLS QHRCGEPRAS PPRTLS DLLLASMVAG WSVGLGGPVDL 
I K I RLQMQT P P VSGRQPR FEVQG S GSCG \ E PAYQG P VHC I TT I V 
RNEGLAG L YRG ASAMLLRDVPG Y CL YF I P YV FLS EW I TPEACTG 
PSPCAVWLAGGMAGAISWGTATPMDWKSRLQADGVYLNKYKGV 
LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


OX** 


J. J o 


RPFT.FTiWRT.RSRSWRPliGVPRRCHRRNWKEPVRAOPLSVTVWAP 
RCQRP /QPPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGSRLGPETFRQRFRQFRYQDAAGPREA 
FRQLREL/S PRQWLRPDI \RTKEQ\ IVEMLVQEQLLAILPEAAR 
ARR I RRRTDVR I TG 


6141 


2 


984 


AQVGPRSRP CKMPLKLRGKFCKAKS KETAGLVEGEPTGAGGGS LS 
AS RAPARRL VFHAQLAHG S ATGR VEG FSS IQ EL YAQ IAGAFE I S 
P S E I L YCT LNT P KIDMERLLGGQLGLE DFI FAH VKG I E KE VNVY 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI 
NGENIVGWRHYDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLELYMG IRD I DLATTN FEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFD VWGVI GDAKRRGL 


6142


116 


602 


EAEGEQVCX3AKCCGDAPHVENREEETAR I GPG VMES KEERALNN 
L I VENVNQEND e kd EKEQ VANKGE PLALPLNVS e Y CVPRGNRRR 
FRVRQP I LQYRWDIMHRIiGEPQARMREENMERIGEEVRQLMEKL 
REKQLSHSLRAVSTDPPHHDHHDEFC\LMP 


6143 


2802 


270 


FRMRIFIiHCPWNQQMWKIWNLLETSLESCKAHLSIQKLLKERXQ 
\QLPVFKHRDSIVETLKRHRVVVVAGET\GSGKSTQVPHFLLED 
LLLNE WE AS KCNT VCTQ P RR I S AVS LANR VCDE LGCENGPGGRN 
SIjCG YQIRMESRACESTRLLYCTTGVLLRKLQEDGLLSNVS /HM 
FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 
KFST YFTHCP I LR I SGRS YP VE VFHLED HE ETGFVLEKDSE YC 
Q KFLE EE E EVT I NVTS KAGG I KK YQE Y I P VQ TGAHADLNP F YQ K 
YSSRTQHA1 LYMNPHKINLDL I LELLAYLDKSPQFRN I EGAVL I 
FLPGLAHIQQLYDLLSNDRRFYSERYKVIAIiHS ILSTQDQAAAF 
TLPP PGVRKI VLATNIAETG I T I PDWFV IDTGRTKENKYKES S 
QMSS LVE T FVS KASALQRQGRAGR VRDGFC FRM YTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMNLLRKI GACELNEPKLTPLGQHLAALPVNVKIGKMLI FGAI F 
GCLDP VATLAAVMTEKS PFTTP IGRKDE ADLAKSALAMADSDHL 
TI YNAYLGWKKARQEGG YRS EI TYCRRNFLNRTSLLTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGNRASQTLS FQEIALLKAVLVAGL 
YDNVGKI IYTKSVDVTEKLACI VETAQGKAQVHPSSVNRDLQTH 
GWLLYQ E K I R YAR VYLRETTL I TP FP VLLFGGD I E VQHR ERLLS 
IDGWIYFQAPVKIAVIFKQLRVLIDSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN 
VSASGGARHGGRGSGGPVICTYG PDLFPLVA\ TIGAAFVAKVMS 
VGDRTVTLG I WDTAGSERYEAMSRI YYRGAKAAI VC YDLTDSS S 
FERAKFVTVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QDYADN I KAQLFETS S KTGQS VDELFQKVAEDYVS VAAFQVMTE 
DKG VDLGQ KPN P YF YS CCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
GPMVYAICYCPLPRLADLEALKVADSKTLLESERERLFAKMEDT 
DFVGWALDVLSPNLISTSMLGRVKYNLNSLSHDTATGLIQYALD 
QGVNVTQVFVDTVGM PETYQAR LQQS FPG IEVTVKAKADALYP V 
\ VSAAS I CAKVAPJDQAVKKWQ FVEKLQDLDTDYG\SG YPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS 
R/ YPGHEAHDQGG\WDARQS 1 1 RKWDPETGRTRLI KGDGE VLE 
EI VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DEMDRLATNMAVI TD FS AR I SATLQDRHER I TKLAGVHALIiRKL 
QFLFELPSRLTKCVELGAYGQAVRYQGRAQAVLQQYQHLPSFRA 
IQDDCQ VI TARLAQQLRQRFREGGSGAPEQAECVELLLALGE PA 
EELCEEFLAHARGRLEKELRNLEAELGPSPPAPDVLEFTDHG\S 
SG FVGGLCQ VAAAYQEL FAAQGPAGAE KLAAFARQLGSR YFALV 
ERRLAQEQGGGDNS LLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLQGLRAAFLGCLTDVRQALAAPRVAGKEGP 
GLAELLANVAS S I LSH I KASLAAVHLFTAKEVS FSNKPYFRGEF 
CSQGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLD YETAT I S Y I LTLTDEQFLVQDQ FPVTP VSTLCAEARETA 
RPJLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRLAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMC I WASHGAS S VARAS VREPQGNKS PRMNTKRAGECLCPRS 
CSFSAQDYDIPAPILPVEKQRLRVTQEVRAGLVLVLKIRPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


3056 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFS TRWNWNE WLKLPVKYPDLPRNAQVALTI WDVYGPGKAV 

SSTLS EDQMSRLAKLTKAHRQGHMVKVDWLDRLTFRE I EM INES 
VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNQDKALTKILTSVIW 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYIiLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLC TFLI S RAS KNSTLAN YLYWYVI VE CEDQDTQQRD P K 
THEMYLNVMRRFSQALLKGDKSVRVMRSLLAAQQTFVDRLVHLM 
KA VQRE S GNRKKKNE RLQALLGDNE KMNLS D VEL I PLPLE PQ VK 
IRGII PETATLFKSALMPAQLFFKTEDGGKYPVI FKHGDDLRQD 
QLILQIISLMDKLLRKENLDLKLTPYKVLATSTKHGFMQFIQSV 
P VAEVLDT EGS I QNF FRKYAP S ENGPNG I S AEVMDTYVKS CAG Y 
CV I TY I LGVGDRHLDNLLLTKTGKLFH I D FGYI LGRDPKPLP P P 
MKiNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNLILNLF 
S LM VD AN I PD I ALE P DKTVKKVQDKFRLDLS DEE AVH YMQSL I D 
E S VHALFAA WEQ I H KFAQ Y WR K 


6149 


1 


1413 


RVDPRVRENGTANP I KNGKTS PAS KDQRTG KKTS VQGQVQ KGND 
E S E S D F E S D P P S P KS F FFFODDF F MT SYZ WCVzn punn rrnj d ttxtt 

GHRPLLMDSEDEEEEEKHSSDSDYBQAKAKYSDMSSVYRDRSGS 
GPTQDLNTILLTSAQLSSDVAVETPKQEFDVFGAVPFFAVRAQQ 
P QQE KNE KNL PQHR F PAAGLEQE E FDVFTKAP FS KKVNVQE CHA 
VGPEAHTI PGYPKS VDVFGSTP FQPFLTSTS KS ESNEDLPGLVP 
FDE ITGSQQQKVKQRSLQKLS SRQRRTKQDMSKSNGKRHHGTPT 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVbQPEESLLDPFGAKPFHSPD\LSWHPP\HQGL 
S\DIRADHNT\VLPGR\ PRONSIiHG^FH^ADVT.KMnnpnauD Jv 
LTELWQSITPHQSQQSQPV\BLDPFGAAPFPSKQ 


6150 


372 


37 


MSNI KKYI IDYDWKAS I E I El DHDVMTEEKLHQINNFWSDSE YR 
LNKHGS VLNAVL IMLAQHALL I AI S SDLNAYGWCE FDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


D SNQQ S VS GTAAS TLLHS FKAT I Y YQGTGHVQQF YG VT S P YSQT 

TPP I VOS YA.O PSTiDY T nnrtDT FT2t wonfi^nnrn D tv tv T^trrT' t \n\ n/^ 
4 * + * W* 3 i^vr uu^ x lyoyyir l/iH.f\^j v V V \Jirf\t\f\\ 111 VArU 

Q PQPLQPS EMWTNNLLDIjPPPSP PKPKTI VLPPNW KTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSSELAKKSKEVFRKEMSQFIVQCLNPYR 
KPDCKVG\RITTTEDFKHLARKLTHGVMNKELKYCKNPE\DLEC 
NENVKHKTKEYIKKYMQKFGAVYKPKEDTEFRVTVGPGWEDGWS 
GKTDS RERKSCGPFCSTP VSTVLLM I HHPGB FNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCX3AVPAKCP 
PWC/DVHEPRCQPPDCHGHGrCVDGHCQCTGHFWRGPGCDELDC 
G P S NC S QHGLCTETGCR CDAGWTG SNCS EECPLGWHGPG CQRPC 
KCEHH C P CD PKTGN CS VS R VKQCLQ P PEATLRAGELS F FTRTAW 
LALTLALAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARS PGRAYALLLLLI CFNVGSGLHLQVLSTRNENKLLPIOT" 
PHLVRQKRAWITAPVALLEGEDLSKKNPIAKIHSDIiAEERGLKI 
TYKYTGKGITEPPFGIFVFNKDTGELNVTSIIjDREETPFFLLTG 
YALDARGNNVEKPLELRIKVLDINDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNXDT 
GEIYTTSVTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 
I LDVNDN I P WENKVLEGM VEENQVNVE VTRI KVFDADE IGSDN 
WLANFTFASGNEGGYFHIETDAQTNEG I VTLI KEVD YEEMKNLD 
FS VI VANKAAFHKS IRS KYKPTP I PI KVKVKNVKEGIHFKSS VI 
S I YVS ESMDRS S KGQ I IGNFQAFDEDTGLPAHARYVKLEDRDNW 
I SVDSVTSEI KLAKLPDFES R YVQNGTYTVKI VAI SEDYPRKT I 
TGTVLINVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPN 
SGPFSFSVIDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSEIQ 
FLISDNQGFSCPEKQVLTLTVCEVLHGS\GCREAQHDSYVGLGP 
AAIALMILAFLLLLLVPLLLLMCHCGKGAKGFTPIPGTIEMLHP 
WNN E GAP P E D KWP S FL P VDQGG S LVGRNG VGGMAKEATMKG S S 
S AS I VKGQHEMSEMDGRWEEHRS LLSGRATQFTGATGAI \MTTE 
TTI TARATGAS RD VAGAQAAAVALNEE FL KN Y FTDKAAS YTE ED 
ENHTAKDCLLVYSQEETESLNASIGCCSFIEGELDDRFLDDLGL 
KFKTLAE VCLGQKID INKE IEQRQKPATETSMNTASHSLCEQTM 
VNS ENTYS SGSSFPVPKS LQEANAE KVTQE I VTERS VS SRQAQ K 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSQPQSLIV 
TERVYAPAS TL VDQP YANEGTVWTERVI Q PHGGGSNPLEGTQH 
LQDVP YVMVRE RE S FLAP S SGVQ PTLAMPN I AVGQNVT VTER VL 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLEESGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


2146 


KKKTKMKNTLQKTVNFGAWPKPTISDKSHLLQMVSKLDLTDAKN 
SDTAH IKSIEITSI LNGLQASES S AEDSEQEDERGAQDMDNNGK 
EESKIDHLTNNRNDLISKEEQNSSSLLEENKVHADLVISKPVSK 
SPERLRKDIEVLSEDTDYEEDEVTKKRKDVKKDTTDKSSKPQIK 
RGKRR YCNTEE CliKTGS PGKKEEKAXNKESLCMENSSNSSSDED 
EEETKAKMTPTiCKYNGLEE KRKS LRTTGFYSGFSE VAEKR I KLL 
NNS DERLQN S RAKDRKD VW S S I QGQW P KKTLKE L FSDS DTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* I PLPYLHLNRLHQSL * QKGS 
RQQSSVTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGELQDLQS 
ERE*LASRF* CQCELEQ+ * SARTRTS * KSLYRSEKSERCSGRRK 
F IKKAEKKP * SNSGKQQKEG K 


6155 


869 


121 


HLLPELRGKSWITKKYVFYLGVLAGTFFFADSSVQFCEDPAPYLV 
YL KS H FN PCVGVL I KP S WVIAP AHCYL PNLKVMLGNFKS RVRDG 
TEQTINPIQIVRYWNYSHSAPQDDLMLIKLAKPAMLNPKVQALN 
P\PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSHRNS LCVKFVKVFSR I FGEVAVATV I CKDKLQGIE 
VGHFMGGDVG I YTNVYKYVSWIENTAKDK 


6156


5725 


3984 


GTSTVTMATKKHFS I ILNLLGMLLKKDNQDTRKLLMTWALEVAV 
VMKKS ET YAPLFCLPSFHKFCKGLIiADTLVEDVN I CLQACS SLH 
ALS SSL P DDLLQRC VD VCR VQL VHRGTC I RQAFG KLLKS I PLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDNWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
A I WEAAQ FTVLS KL RTP LGRAQDT FQT I EG 1 1 RS LAGHTLNPDQ 
DVSQWTTADNDEGHGNNQLRLVLLLQYLENLEKLMYNAYEGCAN 
ALTS P P KVI RT FL YTNRQT CQDWLTR I RLS IMR VGLLAGQ P AVT 
VRHGFDLLTEMKTTSLSG^NELEVSIMMVVEALCELHCPEAIQG 
IAVWSS S IVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCCISSFDKSVLTLASAGCKSASLKHCLNGESRKSVLSKPTDSS 
PEVINYLGNKACECYISTADWAAVQEWQNAIKDLKKSTSSTSLN 
LKADFNYIKSLSSFESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKI EQKYDADLENKLVDW I ILQCAEDIEH 
PPPGRAHFQKWLMDGTVLCKLINSLYP PGQEP I PKI SESKMAFK 
QMEQI SQFLKAAETYGVRTTDI FQTVDLWEGKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTG YGMPRQ IM* DAASCP 


6158


441 


1482 


LGSLI VLSLHCKVI FSSQS LERAMKEKAVDLVP I LAQNPGLAQN 
P ILEGKDHNQNTGVDPI IDHVQDRKTD / SRSKS PHKKRS KSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKBREKDREKDKEKDREREREKEHEKDRDKEKEKE 
QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRS SSRS PRTS KT I KRKSSRS PS PRSRNKKDKKRE KERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
SSVSKEVDDKDAPRTEENKIQHNGNCQLNEENLSTKTEAV 


6159


53 


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPVTLPVSLPVGSCV 
IITGTPILTFVKDPQLEVNFYTGMDEDSDIAFQFRLHFGKPAIM 
NSCVFGIWRYEEKCYYLPFEDGKPFELCIYVRHKEYKVMVNGQR 
I YNFAHRF PPAS VKMLQVFRD I SLTRVLI SD*GRCVRI TAVQEF 
DVSVSCDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP * F * KVADAQPTES E KE I YNQVNWLKDAEGI LEDLQS 
YRGAGHE I REAIQHPADEKLQEKAWGAWPLVGKLKKFYE FSQR 
LEAALRGLLGALTS TP YSPTQHLEREQALAKQFAE I LHFTLRFD 
ELKMTl^PAIQNDFSYYRRTIiSRMRINNVPAEGENEVNNELANRM 
SLFYAEATPMLKTLSDATTKFVSENKNLPIENTTDCLSTMASVC 
R VMLET P E YRS R FTNE ET VS F CLR VMVGV 1 1 L YDHVH P VGAFAK 
TSKIDMKGCI KVLKDQ P PNS VEGLLNALR YTTKHLNDETTS KQ I 
KSMLQ*QLLTLVNKG 


6161 


455 


1569 


P VSGSES S LRRAWAS I LRLMLG PRVAVS I LCEDGI SH * LLEKH* 
KSHVLEPLS SLALEBQCLALS LDWSTGKTGRAGDQPLKI I SS DS 
TGQLHLLMVNETRPRLQKVAS WQAHQFEAW IAAFNYWHPE I VYS 
GGDDGLLRGWDTRVPGKFLFTS KRHTMGVCS IQSS PHREHI LAT 
GSYDEHILLWDTRNMKQPLADTPVQGGVWRIKWHPFHHHLLLAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWLLF 
RS LQRAP S WS F P S NLGT KTADL KGAS EL PT P CHECREDNDG EGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCS FYDHALHLWEWEGN 


6162 


1 


586 


RT IHATGRAGAS PMHRLI VWRLAEANKQHVRCQKCLEFGHWTYE 
CTG KRKYLHRPSRTAE LKKALKE KENRLLLQQ S I GETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*KQHR*QL*R*TTKEE 
EKE IELLHS YWTDGLKTLM 


6163 


1081 


785 


RIRSTTEGCAVRLHPTQNTGKARIMILLSVSLGRHWAFTYKFFL 
TP WF VFFF F F FHRKE * VMQ KN PMKS REDE WME KLNNLHVQRAD 
MNRL I MNYLVTEG FKEAAEKFRMESG I EPS VDLETLDERI KI RE 
M ILKGQ I QEAI AL XNSLKPELLDTWRYLYFHLQQQHLIEL IRQR 
ETEAALEFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVWSEVNQAVLDYENRESTPKLAKLLKLLLWAQN 
ELDQKKVKYPKMTDLSKGVIEEPK 


6164 


90 


406 


PCQSPGRSRMRQDICLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGIPKEWRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDSMG IQ I VKDLHRTGCS S YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
G YEP P LTNVFTMQW FLTL FAT CLPNQTVLKI WDS VF FEG S E 1 1 L 
RVSLAI WAKLGEQ I ECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDSDEEKDPDDEDAWNAVGCLQPFSGFLAPEIiQKYQKQIKE 
PNEEQSLRSNNIAELSPOAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRI KKKQQQQVHQVYIRADKGPVTS ILPSQVNSSPVIN 
HLLIX5KKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHI RVHKKNMPRTKSHPGCGDTVGL IDEQNEAS KTHGLGAA 
E AFPSGCTATAGREGS S PEGS TRRTIEGQSPE PVFGDADVDVSA 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
S KAPQGSNS XTP I FS P FPS VKPLRKSATARNLGL YG PTERTPTV 
HFPQMSRSFSKPGGGNSGP*KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6165 


90 


406 


P CQS PGRS RMRQDKLTGS LRRGGRCLKRQGGGVGT I LSNVLKKR 
SCI SRTAPRLLCTLEPG VDTKLKFTLEPS LGQNGFQQWYDALKA 
VARLSTGI PKEWRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRWLKRVLLAYAR 
WNKTVG YCQGFNILAAL I LEVMEGNEGDALKIMI YL IDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQIECCETADEFYSTMGRIiTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQIJ^ELREKYTYNITPFPATVKPTSVSGRHS 
KARDS DEEND PDDEDAWNAVGCLG P FSG FLAPE LQKYQ KQ I KB 
PNEEQSLRSNNLAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRIKKXQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHI PGHTGGKI S P VP YEDLKTKLNS 
PWRTH I RVHKKNMPRTKSH PGCGDTVGLIDBQNEASKTNGLGAA 
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAKLGALELNQ RDAAAE TE LRVHP P CQRH C P E P P S APE ENKAT 

«"» VK nr\/~i £?MCi Vfrn T COTl fTJ C^rVBT D VO 7\*T*Tvm<rT./2T.vnTyT , T?l3 r PO'T*W 

SKAPQGSNSKTP I r orr P£j Vl^LiKKoAlAKNlAalj xv»r 1KK.1 r 1 1 V 
HF P QM S RS FS KPGGGNSG P * KMVFSSGTMLSRQLPG YPQE YQRN 
GGERFG 


6166


2 


1206 


HKLWRTVAMAGAEWKS LE ECLEKHLPLPDLQEVKRVLYGKELRK 
LDLP REAFEAAS RE DFB LQG YAFBAAE EQLRRP R I VHVGL VQNR 
I PLPANAPVAEQVS ALHRR I KAI VE VAAMCGVN 1 1 CFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTRFCQKLAKNHDMVVVS PILE 
RDS EHGDVLWNTAWI SNSGAVLGKTRKNH I PRVGDFNES TYYM 
EGNLGHPVFQTQFGRIAVNI CYGRHHPLNWLMYS INGAE I I FNP 

GDG KKAHQD FG Y FYGS S YVAAPDSSRTPGLS RS RDGLL VAKLDL 
NLCQQVNDVWNFKMTGRYEM YARE LAEAVKSNYS PTIVKE * PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAHMDIDKDLEAPLYLTPEGWSLFLQRYYQWHEGAELRHLDTQ 
VQRCED I LQQLQAWPQ I DMEGDRN I W I VKPGAKS RGRG IMCMD 
HLEEMLKLVNGNP WMKDGKWWQ KY I ERPLLI FGTKFDLRQWF 
LVTDWNPLTVWFYRDSYIRFSTQPFSLKNLDK*APLYLTPEGWS 
LFLQRYYQWHEGAELRHLDTQVQRCEDILQQLQAWPQIDMEG 

DRN1 W 1 V Ki'GAK.oKtaKb 1 P1L.MUHbBJc»MJjlVuVriljJN V V VriAJJOA-W V 

VQKYIERPLLIFGTKFDLRQWFLVTDWNPLTVWFYRDSYIRFST 
QPFSLKNLDK 


6168


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 

ilLiWUbr ivr v VAAy IvlbKAjAIJr J\o V vl_Ar ri\.y*jyv» I WjU^IUf oil 

DLTLERKCEKRSVYIDARDEELEKDTMDNWDEKKLEEWNKKHG 
EAEKKKPKTQ I VCKHFLEAI ENNKYGWFWVCPGGGD I CMYRHAL 
PPGFVLKKKKKKKKXEDE ISL* DL IERERSALGPNVTKITLESF 
LAWKKRKRQEK I DKLEQDMERRKADFKAGKALVI SGRE VFEFRP 

ETGITVASLERFSTYTSDKDENKLSEASGGRABNGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVITRIIKEALPDGVNISKEARSAISR 
AAS VF VLYATS CANN FAMKG KRKTLNASD VLSAMEEME FQRFVT 
PLKEALEA YRREQKG KKEAS EQ KKKD KDKKTDS E EQDKSRDEDN 
DEDEBRLE EEEQNEEEEVDN+ KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVF I TG ASRG I GKAI ALKAAKDGAN I V I A 
AKTAQPHP KLLGTI YTAAEEI EAVGGKALPCI VDVRDEQQISAA 
VEKAI KKFGG ID I LVNNASAI S LTNTLDTPTKRLDLMMNVNTRG 
TYLASKACIPYLKKSKVAHIPNISPPLNLNPVWFKQHCGRW*W 
G * GDGLCL I CFE LNLCMSD V I T I CT 


6171 


382 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYFPRVLKQVHQAI^SLSQEAVSVMDSMVRDI LD 
R IATEAGHLAH YS KCVTI TS RD IRMAVCLLLPGKMG KLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMIiRREARLRREYLYRKAREEAQR 
SAQERKERLRRALEENRLIPTELRREAIiALQGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKBLKLVFPGA 
QRMNRGRH E VG AL VRACKANG VTD LL WHE HRGT P VGL I VSHLP 
FG PTAYFTLCNWMRHDI PDLGTMSEAKPHL I THGFS SRLGKRV 
SD I LRY L F P VPKDDS HR V I T FANQDD Y I S FRHHV YKKTDHRNVE 
LTEVGPRFELKLYMIRLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE*AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVXiSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
RIHTGERPYVCPLCGKAFNHSTVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSEOGKAFSDRSVLIQHHNVHT 
GEKPYECSECGFCTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHL IQHQKVHRKL * PTCVLS VGSALAGVPTS FS ISVSTLERSP 
MCAVYVGRPS ARAQS LVNTGQFTQVRS PMS VMS VEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLLRPRRIiMNANGRSVARAAELFGL 
TAEE VYL VH D ELDKP LGR LALKLGGS ARGHNG VRSC I S C LNSNA 
MPRLRVG IGRPAHPEAVQAHVLGCFS PAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP*H*WFSKKA 


6175 


2204 


334 


RYFRADPRS RSGQPRAEGLGAFAEGPLRAMAAP VKGNRKQS TEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 

LVTGRRLW KNV YNELGGS PGSTSGATCTRRHY* RLVLP YVRHLK 
GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEAYKRLLSSFYCKGTHGIMSPLAKKKLLAQVSKVEALQCQEEG 

SLREEAQAGPCPAAPIFKGCFYTHPTEVLKPVSQHPRDFFSRLK 
DGVLLGPPGKEGLSVKEPQLVWGGDANRPSAFHKGGSRKG I LYP 
KP KAC WVS P MAKVPAE SPTLPPTFPSS PGLGS KRSLEEEGAAHS 
GKRLRAVSP FLKEADAKKCGAKP AGSGLVSn.T .G PAT fiPVP PT?A 
YRGTMLHCPLNFTGTPGPLKGQAALPFSPLVIPAFPAHFLATAG 
PS PMAAGLMHF PPTS FDS ALRHRLCPAS SAWHAP P VTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


P LSALRAMAE VH V IGQ 1 1 GASGFS ESS LFCKWG I HTGAAWKLLS 
GVREGQTQ VDTPQ IGDMAYWSHP I DLHFATKGLQGW PRLHFQVW 
S QDS FGR CQLiAG YG FCK VP S S PGTHQLACPTWRP LG S WREQIiAR 
AFVGGG PQ LLHGDT I YSGADR YRLHTAAGGTVHL E I GLLLRN FD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAPYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQIKRSDFLGFSGYSPHFVAISTNSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGIiPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGG I KG VARAAS LVGRRRAGTGMALLLCLVCLTAALAHGCL 
HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLAI PAK I TRE KLDQVATA V YQMMDQL YQG KMYFPG YFPNELR 
NIFREQVHLIQNAIIESR1DCQHRCGIFQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCLEPPHLAl^LSLEDAA+CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
P PPKWDRWNEKRAM FGVY DN X GI LGNFEKHPKEL IRGPI WLRG 
WKGNELQRCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR* KRKLRTSEKAHLS PWRRETVLFPVRKRLCI FS VI KWGFFG I 


6180 


156 


1833 


DHH I LKAAS TTHVCARGN I FAI PNTRCLEC * ATAT P S S LECQN * 
S HLS LC P L P ATTSGLT PNSM I P E KERQNI AERLLRVM CADLGAL 
SWSGKEFLKLAOTLVTDSGARYGAFSVTEILGNFNTLALKHDPR 
MY^QVKVKVTCALGSNACLGIGVTCHSQS VGPDSCY I LTAYQAE
GNH I KS Y VLG VKGAD I RDS GDLVHHWVQNVLS EF VM SE IRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 
VIELLNVCEDLAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 
HERYEQ I CEFYSRAKKMNLI QSLNKHLLSNLAAILTPVKQAVIE 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KENFKVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 
KE S WAE E ADFE PAAKKP RS AAVEN PAAQE DDRLG KNEVYDYLQE 
PLFQATPDLFQYWSCVTQKHTKLAKLAFWLLiAVPAVGARSGCVN 
MCEQALLI KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI 
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLELLHI YVGIESNHLLPRFLQLTERI I IliFWITSQEEVQE 
KYWCVLFVFWNLLDMVRYTYSMLSVIGISYAVLTWLSQTLWMP 

YLMMLF IGM YFTYSHLYS ERRDI LGI FP I KKKKM*S TAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS* IDYQLNTLLKEFQLTEENTKLRYLTCSLIEDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERIATQKILSVLGECLDHFGPGCVGVQKILNARCPLVR 

CWARAHSLTSSIPGAWITNFSLTMMVIFFLQRRSPPILPTLDSL 
KT1ADAEDKCVI EGNNCTFVRDLSRI KPSQNTETLE LLLKEFFE 
YFGNFAFDKNSINIRQGREQNKPDSSPLYIQNPFETSLNISKNV 
SQSQLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGG CGS S CGGCX3SGCGGCG SGRGGCGSGCGGCSS S CGG CGS 
RCYVPVCCCKP VCSWVPACS CTS CGS CGGSKGGCGS CGGS KGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI * GSGPRPSGFSCLVKAFLM 
VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQ I KEHLLKS KWCRPTSLNWRI I TS 
EL YRS LGD VLRD VDAKAL VRSDFLL VYGD VI SN INI TRALEEHR 
LRRKL * KNVS VMTM I FKE S S P SH PTRCHEDNVVVAVnciTTNR VI . 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 
QLFTDNFD YQTRD D FVRGLLVNEE I LGNQIHMHVTAKE YGARVS
NLHMYSAVCADVIRRWWPLTPEANFTDSTTQSCTHSRHNIYRG 
PEVSLGHGSILEENVl^SGTVIGSNCFITNSVIGPGCHIEPGD 
NWLDQT YLWQG VRVAAGAQ I HQS LL CDNAEVKER VTLXPRS VL 
T S OVWG PNI TLP EGS VI S LH P PD AE E DE DDGK F«; DDS R AHOF X 
DKVKMKG YNPAEVGAAGKG YLWKAAGMNMEEEEELQQNLWGLK r 
NMEEESESSSEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
G KE EN I S CDNLVLE I NS LKYAYN I SLKE VMQVLS HVVLE FPLQQ 
MDS PLDS S R YCALLLPLLKAWS P VFRNY I KRAADHLEALAAI ED 
F FLEHEALG I SMAKVLMAF YQLE I LAE ET I LS W FS QRDTTDKGQ 
QLRKNQQLQRFIQWLKEAEEESSEDD 


6185 


791 


44 


PCTS CVLWATLHLPASTRKAPQAECGM I S I TEWQKI G VG ITG FG 
IFFILFGTLLYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFL^NVCNIPFIX5ALFRRLC<;TSSMV*KTEMSSLNLDHWLKGAK 
REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 
GCQE AEMQT P RRLG WGW YHTLT LYLWEE K 


6186 


569 


238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDVNGLALKMA
KERKVKNKVKNKADTEEVFNNSPTNQEKMPTSAILPDFSGSVIS 
NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 
PNRETS EANKERRKMTSKS SESNI YS PLTS F ITADSELHD I I KD 
LE DCLMVG LHTCG DLAPNTLR I FTSN S E 1 KG VCS VGCC YHLLS E 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 
AAGQG L P TE S LFYRAVLQD 1 1 KDC YG I TKCDRHVG KIYSKCSSF 
LDYVRRSLKKLGLDESKLPEKI IMNYYEKYKPRMNELEAFNMLK 
WLAPCIETLILLDRLCYLKBQEDIAWSALVKLFDPVKSPRCYA 
VI ALKKQQ* FPLKQI IRCISL* DSAGCAEEVS VGDGGPAliRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARI LNPDS FI E PRPGRLP ELEATRPHMEPKAS C PA 
AAPLMERKFHVLVGVTGSVAALKLPLLVSKLLDIPGLEVAVVTT 
ERAKHFYSPQDIPVTLYSDADEWEMWKSRSDPVLHIDLRRWADL 
LLVAPLDANTLGKVASG I CDNLLTCVMRAWDRS KPLLFCPAMNT 
AMWEHP I TAQQVDQLKAFGYVE I PCVAKKLVCGDEGLGAMAE VG 
TIVDKVKEVLFQHSGFQQS*PGISVMGVPLYSEWVQAKSVKMDV 
GKIGGYPHLL.NGGPALSLPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188 


238 


1534 


KGFVNAGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
N 1 GVF I C I RCAG IHRNLGVH I SR VKS VNLDQWTQEQ I Q CMQEMG 
NGKANRLYEAYLPETFRRPQIDPAVEGFIRDKYEKKJCYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKSS PKSTAPVMDLLGLDAPVACS IANSKTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
Q LS KDS I LS L YGSQT PQM PTQ AM FMAP AQMAY P TAY P S F PG VT P 
PNS I MGSMMPP P VGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTQQAGYMAGMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNBQAANQTLS PQMWK 


6189 


1297 


793 


LGEPLGDLCELIPGDVQQLQMGEVHPGTGAQGSAAQSVAGEVQL 
TQLSHARQRPSCQGSQLIALDLQHMDISRQPRWQHVQPVARQVQ 
RAQQAQLAEGVAVHLWAGDAWAEVELLQEVGGGKVFAANACDL 
WQDHEGA^AARQATGHALQRVIVQVRRVQPLEAL*RVPSGLPR 
RVRAFMILHNQI TGIGRED FATTY FLEELNLS YNRITS PQVHRD 
AFRKLRLLRSIiDLSGNRLHMLPPGLPRNVHVLKVKRNELAAIiAR 
GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLD IAGNQ 
LTEI PEGLPESLEYLYLQNNKISAVPANAFDSTPNLKGI FLRFN 
KLAVG S WDSAFRRL KHLQ VLD I E GNL E FGD I S KDRGRLGKEKE 
EEEEDEVEEEETR 


6190 


66 


1309 


I LVGNVSFLLS FAE YVCNCS WGS liNVNR CNQTTGQCE CRPGYQ 
GLHCETCKEGF YLNYTSGLCQPCDCS PHGALS I PCNSSGKCQCK 
VGV I G S ICDRCQDG YYGFS KNGCL P CQCNNRS AS CD ALTG ACLN 
CQENS KGNHCE E C KEG F YQS PDAT KE CLRCP CS AVTSTG S CS I K 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHDIiEGNCIK 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTSALADVSWTQFNI 1 1 LTV 1 1 1 WVLLMGFVGAVYMYRE 
YQNRKLNAPFWT I ELKEDN I S FSSYHDS I PNAD VSGLLEDDGNE 
VAPNGQLTL TTP IHNYKA 


6191 


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDWIKKIWYIYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW
DSAIPVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVISAVGT
IVKKAKQ


6193 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW
DSAIPVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVISAVGT
IVKKAKQ


6194 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW
DSAIPVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVISAVGT
IVKKAKQ


6195 


736 


235 


VANGLQSNM P KF YCDYCDT YLTHDS PS VRKTHCSGR KHKENVKD 
YYQKWMEEQAQSL I DKTTAAFQQGKI P PTPFS APP PAGAMI PPP 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISNIiEARNLGPRLTPLLQEEDSH 
QRLLMGLMVS EL KDHFLRHLQGVE KKKI EQMVLD YI S KLLDL I C 
H I VETNWRKHNLH S W VLHFNS RGS AAE FAVFH I MTR I LE ATNS L 
FLPLPPGFHTLHTILGVQCLPLHNLLHCIDSGVLLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRLLQNYKKQPRNSMINKSSFSVEFLPLNYFIE ILTD IES S 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3 


819 


ADPEGTEEAVMS RYTRPPNTS LFI RNVADATRPEDLRREFGRYG 
P I VDVYI PLDFYTRR PRGFAYVQFEDVRDAEDALYNLNRKWVCG 
RQIEIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RS S S WGRNRRRS DS LKESRHRR FSYSQSKSRSKSL PRRSTS ARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLLRKLPALSDGTLPHPDTLGMNYEGARSE 
RENHAADDS EGGALDMCCSERL PGLPQP I VMEALDEAEGLQDSQ 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIWSQNATNLVSSLLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSGGLVLVTTLVWHIjLRTPPEPPTPLPPEDRRQSV 
S RQ PS FT YS E WMB E K I EDDFLDLD P VPB TP VFDCVMD I KPEADP 
TSLTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREES 
AREYLLSASRVLQAEELHEKALDPFLLQAEFFEIPMNFVDPKEY 
DIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRG 

yggeek^iatqgpivstvadfwrmvwqbhtpiivmitnieemn 

E KCTE Y W P E E Q VA YDG VE I TVQKV I HTEDYRLRLI S LKSGTEE R 
GLICHYWFTSWPr>2KTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCS AG IGRTGCF I ATS I CCQQ LRQEG WD I LKTTCQLRQDRGG 
M I QH CEQ YQ FVHHVMS LYEKQLSHQ S P E 


6199 


144 


1211 


MARENGES SSS WKKQAEDI KK I FEFKETLGTGAFS E WLAEEKA 
TGKLFAVKCIPKKALKGKESSIENEIAVLRKIKHENIVALEDIY 
ES PNHLYLVMQLVSGGELFDR IVEKGFYTEKDASTLIRQVLDAV 
YYLHRMGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMEGKGD 
VMSTACX3TPGYVAPEVLAQKPYSKAVDGWSIGVIAYILLCGYPP 
FYDENDS KLFEQILKAEYEFDSPYWDDI SDSAKDFI RNLMEKDP 
NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 
AFNATAVVRH^KIjHLGSSLDSSNASVSSSLSLASQKDCASGTF 
HAL* 


6200 


702 


96 


LPEVPHSLRPRVKPHLCCAQPAVRVMARLPKLAVFDLDYTLWPF 
WVDTHVD P P FHKS SDGTVRDRRGQD VRL Y P E VPEVLKRLQS LG V 
PGAAASRTSEIEGANQLLELFDLFRYFVHREIYPGSKITHFERL 
QQKTGIPFSQMIFFDDERRNIVDVSKLGVTCIHIQNGMNLQTLS 
QG LETFAKAQTGP LRSS LE ES P FE A 


6201 


2809 


2383 


GQT PR VR W KMRRS LRAGKRRQTAGRKSKS P P KVP I VI QDDSLPA 
GP PPQ I R I LKRPTSNGWSS PNS TSR PTL P VKSLAQRE AEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE
DGIFDSGNFEQFLREKVKVNGKTGNLGNWHIERFKNKITWSE 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDESESED 


6203 


419 


2550 


RCPRPPATAGAAASRPDRSPPSG I SGSEAAAGAGAAAPASQHPA 
TGTGAVOTEAMKQ I LGVI DKKLRNLEKKKGKLDD YQERMNKGER 
LNQDQLDAVS KYQEVTNNLiEFAKELQRS FMALSQDIQKTI KKTA 
RRE Q LMRE EAEQKRLKTVLELQ YVLD KLG DDE VRTD LKQGLNG V 
PILSEEELSLLDEFYKLVDPERDMSLRLNEQYEHASIHLWDLLE 
GKE KP VCGTT YKVLKE I VER V FQ S NY FDSTKNHQNGL CEEE E AA 
SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 
GEKEQVDEWTVETVEVVNSLQQQPQAASPSVPEPHSLTPVAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPAIVSAQPM 
NP TQNMDM PQLVCP P VHS E S RLAQ PNQ VP VQP EATQV PLVS STS 
EG YTASQ PLYQPSHATEQRPQKE P I DQIQAT I SLNTDQTTASSS 
LPAASQPQVFQAGTS KPLHS SG I NVNAAPFQSMQTVFNMNAPVP 
PVNEPETLKOQNQYQAS YNQS FS S QPHQ VEQTELQQEQLQTWG 
T YHG S PDQSHQ VTGNHQQ P PQQNTG F PR SNQ P YYNSRGVS RGGS 
RG ARGLMNG YRG PANG FRGG YDG YRP S FSNTPNSG YTQS QFS AP 
RD YSG YQRDG YQQNFKRGS GQ SG P RG APRGRGG P PR PNRGMP QM 
NTQQVN 


6204 


2933 


787 


CTHNL I SLLGGRALIHFNRFLNLK I QEGEAHN I FCPAYDCFQLV 
PGDI IKSWSKEMDKRYLQFDIKAFVENNPAIKWCPrPGCDRAV 
RLTKQG SNTSGS DTLS FPL LRAP AVD CGKGHL FCWECLGE AHE P 
CDCQTWKNWLQKI TEMKPEELVGVS EAYEDAANCLWLLTNS KP C 
ANCKSPIQKNEGCNHMQCAKCKYBFCWICLEEWKKHSFVHWEVI 
YRCTRYEVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHSYQLEQRLLKTAKEKMEQLSRALKETEGGCPDTTFIEDAV 
HVLLKTRRILKCS YPYGFFLEPKSTKKE I FELMQTDLEMVTEDL 
AQKVNRP YLRTPRHKI IKAACLVQQKRQE FLAS VARGVAPADS P 
EAPRRSFAGGTWDWEYLGFAS PEEYAEFQYRRRHRQRRRGDVHS 
LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
SLRD YT PASRS ENQDS LQALS S LDEDDPN I LLAI QLS LQESGLA 
LDEETRDFLSNEASLGAIGTSLPSRLDSVPRNTDSPRAALSSSE 
LLELGDSLMRLGAENDPFSTDTLSSHPLSEARSDFCPSSSDPDS 
AGQDPN I NDNLLGNI MAWFHDMNPQS IAL I PPATTE I SADSQLP 
CI KDGSEGVKDVELVLPEDSMFEDAS VSEGRGTQ1 EENPLEBNI 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMALE VGDMEDGQLSDSDS DMTVAPSDRPLQL P KVLGGD 
S AMRAFQNTATACAP VSHYRAVES VDS SEES FSDSDDDS CLWKR 
KRQKCFNPPPKP E P FQFGQS SQKP P VAGGKKINNI WGAVLQEQN 
QDAVATE LG I LGMEGT I DRS RQS ET YNYLLAKKLRKE SQEHTKD 
LDKELDEYMHGGKKMGSKEEENGQGHLKRKRPVKDRLGNRPEMN 
YKGRYE I TAEDSQEKVADE IS FRLQE PKKDL I AR WRI IGNKKA 
I E LLME TAE VEQNGGLFI MNGS RRRT PGGVFLNLLKNTP S I S E E 
Q I KD I F Y I ENQKE YENKKAARKRRTQVLGKKMKQAI KSLNFQED 
DDTSRETFASDTNEALASLDESQEGHAEAKLEAEEAI EVDHSHD 
LDIF 


6206 


10 


1442 


1 1 S ERRERS CLHLVC I RCS CDWEMGS VLGLCSMASN I PCLCGS 
AP CLLCRCCPSGNNSTVTRIiI YALFLLVGVCVACVML I PGMEEQ 
LNKI PGFCENEKG WPCNI LVGYKAV YRLCFGLAMF YLLLS LLM 
I KVKS S SDPRAAVHNGFWFFKFAAAI AI 1 1 GAFF I PEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLSATALN YLLS LVAI VLF FVYYTHPAS CS ENKAF I S VNMLLC 
VGASVMSILPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPSLLS IIGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYSSIRTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
P AVDNFRDG VT YS YS FFHFMLFL.AS L Y T MMTLTNW YR YE PS RE M 
KSQWTAVWVKISSSWIGIVLYVWTLVAPLVIiTNRDFD 


6207 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA
SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA
HEKDMELSFAVQRSKDMVCGICMEWYEKANPSERRFGILSNCN
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGKCFYKHAYPD
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE
VVTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL


6208 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA
SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA
HEKDMELSFAVQRSKDMVCGICMEWYEKANPSERRFGILSNCN
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE
VVTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL


6209 


1758 


829 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQERNSVTHHEVKCQGK 
PLAGIYRKREEKRNAGNAVRSAMKSEEQKIKDARKGPLVPFPNQ 
KS EAAE P PKT P PS SCDSTNAAIAKQALKKP I KGKQAPR KKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
KIDLIDGKGRGVIATKQFSRGDFWEYHGDLIEITDAKKREALY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWL 
KH 


6210 


3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITWLLLSACFVT
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK
QLGCPTAVTAIGRVHASKGFGHIWLDSVSCQGHEPAVWQCKHHE
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD
DSWDLSDAHWCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR
EMNSCLNADDLDLMNSSGGHSEPH


6211 


3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITWLLLSACFVT
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD
IEDAHVLCQQLKCGVALSTPGGARFGKGNGQIWRHMFHCTGTEQ
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD
DSWDLSDAHWCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR
EMNSCLNADDLDLMNSSGGHSEPH


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK
GPSGPSGPSTSSTSKSSSGSGNPTRK


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK
GPSGPSGPSTSSTSKSSSGSGNPTRK


6214 


2 


460 


HE LAP S A I RRAARLG LG P ARWQ S RAAAFY FVRGFRTGWS FVGWV 
VLGTSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYN 
YRT YAVRR I RDAFRENKNVKDP VE I QTLVNKAKRDLGVIRRQVH 
IGQLYSTDKLI IENRDMPRT 


6215 


2 


1849 


F VAGGPRGSGS AAETM P E I RVT PLGAGQDVGRS C I LVS IAGKNV 
MLDCGtfHMG FNDDRR F P D F S Y I TQNG RLTDFLDCVI I S HFHLDH 
CGALPYFSEMVGYDGP I YMTHPTQA1 CPILLEDYRKIAVDKKGE 
ANFFTSQMI KDCMKKWAVHLHQTVQVDDELE I KAYYAGHVLGA 
AMFQIKVGSESWYTGDYNMTPDRHLGAAWIDKCRPNLLITEST 
YATTIRDSKRCRERDFLKKVHETVERGGKVLIPVFALGRAQELC 
I LLE T FWERMNLKVP I Y FS TGLTE KANH YY K1»F I P WTNQKI RKT 
FVQRNM FEFKHI KAFDRAFADNPG PMWFATPGMLHAGQS LQI F 
RKWAGNEKNMVI MPGYCVQGTVGHKI LSGQRKLEMEGRQ VLEVK 
MQ VE YMS FS AHADAKG I MQ L VGQAE P ES VLLVHGEAKKME FLKQ 
KIEQELRVNCYMPANGETVTLPTS PS I PVGISLGLLKREMAQGL 
LPEAKKPRLLHGTLIMKDSNFTUjVSSEQALKEI^LAEHQLRFTC 
RVHLHDTRKEQ E TALRVY S HL KS VL KDH CVQHL P DG S VTVE S VL 
LQAAAPSEDPGTKVLLVSWTYQDEELGSFLTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPEPRNSALRQSRSKMAWGVSSVSRLLGRSRPQLGRPMSS 
GAHG E EG S ARMW KTLTFFVALPG VAVS MLNVYLKS HHGEHERP E 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
LRKLFIGGLSFETTDDSLREHFEKWGTLTDCVVMRDPQTKRSRG 
FGFVT YS CVE EVDAAMCARPHKVDGRWE P KRAVSREDS VKPGA 
HLTVKKI FVGGI KEDTEEYNLRDYFEKYGKIETIEVMEDRQSGK 
KRGFAFAH'FTIDHDTVDKIWQKYHTINGHNCEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGDGGYNGFGGDGGNYGGGPGYSSRGGYGGGGPGYG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218


1305 


906 


S CERRGFI MADDLKRFLYKKLPS VEGLHAI WSDRDG VPVI KVA 
NDNAPEHALRPGFLSTFAIiATDQGSKLGLSKNKS 1 1 CYYNTYQV 
VQFNRLPLWS F IASSSANTGLI VSLEKELAPLFEELRQWEVS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFL 
IGVSGGTASGKSTVCEKIMELLGQNEVEQRQRKWILSQDRFYK 
VLTAE Q KAKALKGQ YN FDHPDAFDNDLMHRTLKN I VEGKTVE V P 
TYDFVTHSRI^ETTVVYPADVVLFEGILVFYSQEIRDMFHLRIiF 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTTFVKPAFEEFCLP 
TKKYADVI I PRGVDNMVAINLI VQHIQDIUJGDICKWHRQGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EQNI S LEMS CT I E KALAD AKALVERLRDHDD AAE S L I EQTTALN 
KRVEAMKQYQEEIQELNEVARHRPRSTLVMGIQQENRQIRELQQ 
ENKELRTS LE EHQS ALELIMS KYREQMFRLLMASKKDDPG 1 1 MK 
LKEQHSKIDMVHRNKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RWIWDLNPVSDGLELRPKYNGILHCLTTIWK1J3GLRGLYQGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLC I TNPLWVTKTRLMLQYDAWNS PHRQ YKGMFDTLVK I YK 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKLKYNQHINRLPE 
AQLSTVEYI SVAALSKI FAVAATYPYQWRARLQDQKMFYSGVI 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 


6222


2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKPILCPRRTTAQLG 
PRRNPAWSLQAGRLFSTQTAEDKEEPLHSIISSTESVQGSTSKH 
EFQAETKKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLV 
SDGQALPEMEIHLQTNAEKGTITIQDTGIGMTQEELVSNLGTIA 
RSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVySR 
S AAPGSLGYQWLS DGSGVF E I AEASG VRTGTKI I IHLKSDCKEF 
SSEARVRDWTKYSNFVSFPLYLNGRRMNTLQAIWMMDPKDVRE 
WQHEEFYRYVAQAHDKPRYTLHYKTDAPLWIRSIFYVPDMKPSM 
FDVSRELGSS VALYSRKVL I QTKATD I LPKWLRFIRGWDSED I 
PLNLSRELLQESALIRKIiRDVLQQRLI KFF IDQSKKDAEKYAKF 
FEDYGLFMREGIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
E Y AS RM RAGTRN I Y YLCAPNRHLAEH S P YYE AMKKKDTE VLFC F 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLS E KE TEELMAWMRNVLGS R VTNVXVTLRLDTHPAMVTVLEMG 
AARH FLRMQQLAKTQE E RAQLLQ PTLE I NPRHAL I KKLNQLRAS 
EPGLAQLLVDQIYENAMIAAGLVDDPRAMVGRLNELLVKALERH 


6223 


3 


715 


DAWARTMAGMVDFQDEEQVKSFLENMEVECNYHCYHEKDPDGCY 
RL VDYLE G I RKNFD EAAKVLK FN CE ENQHSDS CY KLGAYYVTG K 
GGLTQDLKAAARCFLMACEKPGKKS I AACHNVGLLAHDGQVNED 
GQPDLGKARDYYTRACDGGYTSSCFNLSAMFLQGAPGFPKDMDL 
AC KYS MKACDLGH I WACANAS RM Y KLGDGVDKVEAKAEVLKNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTISSMAWGPLLLTLLAHCTGSWAQSVLTQPPSVSGARIPHBK 


6225 


3259 


938 


LLSCHRLAICKLPFSVESRKTVMGPQGARRQAFLAFGDVTVDFT 
OKEWRLLSPAQRALYREVTLENYSHLVSLGILHSKPELIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SOGQRENPTE I DKVLKGIENSRWGAFKCAERGQDFSRKMMVI IH 
KKAHSRQKLFTCRECHQGFRDES.ALLLHQNTHTGEKSYVCSVCG 
RG FS LKANLLRHQRTHSGE KP F L CKVCGRG YTS KS YLTVHERTH 
TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKECGRGYT 
NKS YFWHKR IHSGEKPYRCQECGRG FSNKSHL ITHQRTHSGEK 
PFACRQCKQS FSVKGSLLRHQRTHSGEKPFVCKDCERS FSQKST 
LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQITHSEEKPFVC 
KD CGRGF I Q KSTFTLHQRTHS E E KP YGCRECGRRFRDKS S YNKH 
LRAHLGEKRFFCRDCGRGFTLKPNLT I HQRTHSGEKPFMCKQCE 
KS FS LKANLLRHQWTHSGERPFNCKDCGRGF I LKSTLLFHQKTH 
SGEKP FI CS ECGQGFIWKSNLVKHQLAHSGKQ PFVCKECGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVSELLGGSQRLFFLPLWRRLCRCGLGPRVSPMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227 


2581 


890 


MSASSLLEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYT AM SDS YLPSYYSPSI GFS YS LGE AAWS TGGDTAM P YLTS 
YGQLSNGEPHFLPDAMFGQPGALGSTPFIiGQHGFNFFPSGIDFS 
AWGNNS SQGQS TQSSGYSSNYAYAPS S LGGAM I DGQSAFANETL 
NKAPGMNTIDOGMAALKLGSTEVASNVPKVVGSAVGSGS ITSNI 
VASNS L P PAT I AP P KPAS WAD I AS KP AKQQ PKLKTKNG I AGS S L» 
P P P P I KHNMD IGTWDNKGP VAKAP S Q AL VQNIGQ P TQGS PQP VG 
QQANNS P P VAQ AS VGQQTQPLPPPP PQ PAQLS VQQQAAQPTRWV 
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRSIN 
NYNPKDFDWNLKHGRVFI I KS YSEDD IHRS I KYNI WCSTEHGNK 
RLDAA YRS MNG KG P VYLLFS VNGSGH F CGVAEMKS AVD YNTCAG 
VWSQDKWKGRFDVRWI FVKDVPNSQLRH I RLEKNENKPVTNSRD 
TQEVPLEKAKQVLKI IAS YKHTTS I FDDFSHYEKRQ 


6228 


47 


1978 


GRRCRRRGAVMELAQEARELGCWAVEEMGVPVAARAPESTLRRL 
CLGQGADIWAYILQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 

QRRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLQDMERKAKV 
DVT FGS LTS AALGL E P WLRD VRTACTLRAQFLQNL LL PQ AKRG 
SLPTPHDDHFGTSY^VOjSSVETLLTi^PPGHVLAALEHLAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTLLKERQVLTORLQGLVEEVERRVLGSSERQVL 
ILGLRRCCLWTELKALHDQSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQL VE E TQ EQ VR LT j I KGNSAS KTRLCRS PGE VLALVQRKW 

TVLPSIHQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
Q KEN LGQAL KR LE KL L KQ AL E R I P ELQG I VGD WWEQ PGQ AALS E 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229 


1571 


560 


GPS LLGTRGTPNPARTLQ I F FL 1 1 GRRLTGRMAAVDDJjQ FEE FG 
NAATSLTANPDATTVNI EDPGETPKHQPGS PRGSGREEDDELLG 
NDDSDKTELLAGQKKSSPFWTFEYYQTFFDVDTYQVFDRIKGSL 
LPIPGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNliSNFLI 
HT XJFKTYHYVPFFR KV<Z T AAT T T YAYAWT iVPT. AT .WO FT .MWRN S V 
VMNIVSYS FLE I VCVYG YSLF I Y I PTAI LW 1 1 PHKAVRWI LVM I 
ALG I SGSLLAMTFW PAVREDNRRVALAT I VT I VLLKMLLS VGCL 
AY FFDAPEMDHLPTTTAT PNQTVAAAKS S 


6230 


1723 


600 


S KMSGRSGKKKMS KLS RSARAGV I FP VGRLMR YLKKGTFKYR I S 
VGAP V YMAAVI EYLAAE I LELAGNAARDNKKAR I APRHI LLAVA 

PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
S EDGPGDG FT I LSS KS LVLGQKLSLTQSD I SHIGSMRVEG I VHP 
TTAE IDLKED IGKALE KAGGKE FLETVKELRKSQGPLEVAE AAV 
SQS SGLAAKFVIHCH I PQWGSDKCEEQLEET I KNCLSAAEDKKL 
KS VAFPPFPSGRNCF P KQTAAQVTLKAISAHFDDS S ASS LKNVY 
FLLFDSES IGI YVQEMAKLDAK 


6231 


149 


870 


LI FSS STMDRS LRNVLWS FGFLLLFTAYGGLQSLQSSL YS EEG 
LGVTALSrLYGGMLiLSSMFLPPLLIERLGCKGTIILSMCGYVAF 
SVGNFFASWYTLI PTS ILLGLGAAPLWSAQCTYLTITGNTHAEK 
AGKRGKDMVNQYFGIFFLIFQSSGVWGNLISSLVFGQTPSQETL 
PEEQLTSCGASDCLMATTTTNSTQRPSQQLVYTLIiGIYTGSGVL 
AVLM I AAFLQ P I RD VQRE S E 


6232 


3679 


1476 


FVAGTTMAGFWVGTAPLVAAGRRGRWPPOXJLMLSAAIjRTLKHVL 
YYSRQCLMVSRNLGSVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMGI KTVAIHSDVDAS SVHVKMADEAVCVGPAPTSKSYLNMDA 
IMEAIKKTRAQAVHP<3YGFLSENKEFARCLAAEDWFIGPDTHA 
I OAMGDKI F S KLLAKKAE VNTI PGFDG WKDAEE AVR TARE I G Y 
PVMI KASAGGGGKGMR I AWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKFIDNPRHIEIQVLGDKHGNALWLNERECSIQRRNQKWEE 
APSI FLDAF/rRRAMGEQAVALARAVKYSSAGTVEFLVDSKKNFY 
FLEMNTRLQVEHP VTEC I TGLDLVQEM I RVAKG Y PLRHKQAD I R 
INGWAVECRVYAEDPYXSFGLPSIGRLSQYQEPLHLPGVRVDSG 
IQ1X5SDISIYYDPMISKLITYGSDRTEALKRMADALDNYVIRGV 
THNIALLREVIINSRFVKGDISTKFLSDVYPDGFKGHMLTKSEK 
NQLIJVIASSLFVAFQLRAQHFQENSRMPVIKPDIAIWELSVICLH 
DKVHTWASNNGS VFS VEVDGS KIjNVTSTWNLAS PLLS VS VDGT 
QRTVQCLSRFAGGNMSICFLGTVYK^ILTOLAAEI^KFMLEKV 
TEDTSSVLRSPMPGVWAVSVKPGDAVAEGQEICVIEAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233 


1 


2654 


HSTREN^AGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP 
LACSRTYFFGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIEAV 
LAGIACYAKTSSLTKAKEVAEQTLGSGLDSFELIPFKAALRSKM 
T FH IHAVNNQGR I VPLDSEDSLS FVKTACMAVYD I PDLLGGNGC 
LGS WFS ES FLTSQ I LVKEKDGTVTTETSS WLTAAVPRFCS WL 
VEDNE VKLS E KTHQAVRGDE S FLGT YLTGGEGAYL YS SNLQSWP 
E EGNVH FFS SGLLFSHCRHGS 1 1 1 S KDHMNS I SFYDGDSTSTVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSBVF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AE I NTTHR I E S D KV IIS I VTGLPGCHAS E LCAFL VTLHKE CGRW 
M V YRQ I MDS S EC FHAAHFQR YL S S ALEAQQNRSARQ SAY I RKKT 
RLLWLQG Y TD V I D WQALQTHP D SNVKAS FT IG A I TAC VE PMS 
CYKEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPLLVQLQSL 
IRAANPAAAFILAENGIVTRNEDIELILSENSFSSPEMLRSRYL 
MYPGWYEGKLNAGSVYPLWVQICVWFGRPLEKTRFVAKCKAIQS 
SIKPSPFSGNIYHILGKVKFSDSERTMEVCYNTLANSIiSIMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDS1KDWLRQSA 
KQKPQRKALKTRGMLTQQE IRS I HVKRHLEPLPAG YFYNGTQFV 
N FFGDKT DFH P LMDQ FMND YVE E ANRE I EKYNQELEQQEYHDLF 
ELKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSLVYAGIKSIVKSSLGMVESSRHNWSGL 
D KQ S DI QNLNE ER I LALQLCGW I KKG TD VDVGPFLNS L VQEGE W 
ERAAAVALFNLDIRRAIQILNEGASSEKGDI^NLNVVAMALSGYT 
DEKNSLWREMCSTLRLQLNNPYLCVMFAFLTSETGSYDGVLYEN 
KVAVRDR VAFACKFLS DTQLNR Y I EKLTNEMKEAGNLEG I LLTG 
LT KDGVD LME S YVDRTGDVQTAS YCMLQG S PLDVLKD ERVQYW I 
EOTRNLLDAWRFWHKRAEFDIHRSKLDPSSKPLAQVFVSCNFCG 
KS I S YSCSAVPHQGRGFSQYGVSGS PTKS KVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNLVPAETV 
QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDS YEVLDLTE YARRHQWWNRVFGHS SGPMVE KYSVATQ 
IVMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQIASHSGYVQI 
DWKRVEKDVNKAKRQ I KKRANKAAPE INNL I EEATEF I KQNIVI 
SSGFVGGFLLGLAS


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKSEKAEKAKIKKAIOKGNMEVARIHAE 
NAI RQKNQAVNFLRMSAR\nDAVAARVQTAVTMGKVTKSMAGVVK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVKDVNTALQEVLKTALIHDGLARGIREAAKA 
LD KRQ AHLC VLASNCDE PM YVKLVE ALCAEHQ I NLI KVDDNKKL 
GEWVGLCKIDREGKPRKWGCSCWVKDYGKESQAKDVIEEYFK 
CKK 


6238 


2 


4666 


EBVPTQESVKWEINVIIKNPEIVFVADMTKNDAPALVITTQCEI 
CYKGNLENS TMTAAIKDLQVRACP FLPVKRKGKI TTVLQPCDLF 
YQTTQKGTD PQV IDMSVKS LTLKVS PVI INTM IT I TSALYTTKE 
TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEM IKMNI DS I FI VLEAGIGHRTVPWLLAKSRFSGEGKNWSSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNLGIK 
MKKKAKMAI VESDPEEENYKVPE YKTVI S FHSKDQLN ITLSKCG 
LVMLNNLVKAFTE AATGS S ADFVKDLAPFM I LNS LGLTISVS PS 
DSFSVLNIPMAKSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFI LLTPVNHSTADKI PLTKVGRRL YTVRHRESGVERS I VCQI 



WO 01/53312 



PCT/US00/34263 



DTVEGSKKVT I RS P VQ I RNHFS VPLS VYEGDTLLGTAS PENEFN 
IPLGSYRSFIFLKPEDENYQMCEGIDFEEIIKNDGALLKKKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYKIAYYIEGIENSVFTLSEGHSAQICTAQLGKARLHLKL 
LD YLNHDW KS E YH I KPNQQD I S FVS FTCVTEME KTDLD I AVHMT 
YNTGQTVVAFHSP YWMVNKTGRMLQYKADGIHRKHP PNYKKPVL 
FS FQPNHFFNNNKVQLMVTDSELSNQFS IDTVGSHGAVKCKGLK 
MDYQVGVTIDLSSFNITRIVTFTPFYMIKNKSKYHISVAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYFNKQENC 
I LLRLDNE LGG I I AEVNLAEHSTVI TFLD YHDGAATFLL I NHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES EKAELAEQE I AVALQDVG I S LVNNYTKQE VAYIG ITSSD 
WWETKPKKKARWKPMSVKHTEKLEREFKEYTESSPSEDKVIQL 
DTNVP VRLT PTGHNMK I LQ PHV I ALRRNYLP ALKVE YNTS AHQ S 
S FR IQI YRI QI QNQ IHGAVFP FVFYPVKPPKS VTMDSAPKPFTD 
VS I VMRSAGHSQISRI KYFKVIilQEMDLRLDLGFIYALTDLMTE 
AE VTENTE VELFHKD I EAFKE E Y KT AS LVDQS QVS L YE Y FHI S P 
I KLHLS VS L S S GREEAKDS KQNGGL I P VHS LNLLLKS I GATLTD 
VQ D WFKLAFFE LN YQ FHT TSDLQ S EVI RH Y S KQ AI KQM YVL I L 
GLDVLGNPFGLI REFSEGVEAFFYE PYQGAI QGPEE FVEGMALG 
LKALVGGAVGGLAGAAS KI TGAMAKGVAAMTMDED YQQ KRREAM 
N KQ PAG FREG I TRGGKGL VS G FVS G I TG I VT KP I KGAQ KGGAAG 
FF KGVGKGLVGAVARPTGG 1 1 DMAS STFQG I KRATETS EVES LR 
P P R FFNEDG VI R P YRLRDG TGNQM LQKIQ FYRE W I MTHS S S S DD 
DDDDDDDDESDLNH 


6239 


2108 


634 


KPGMAGKGS SGRRPLLLGLL VAVATVHLVI C P YTKVEES FNLQA 
TKDLLYHWQDLEQYDHLEFPGWPRTFLGPWIAVFSSPAVYVL 
SLLEMS KFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM 
F CWVTAMQ FHLM F YCTRTL PNVIJUj P VVLLAIiAAWtiRHE WARF I 
WLSAFAI IVFRVELCLFLGLLLLLALGNRKVS WRALRHAVPAG 
ILCLGLTVAVDSYFWRQLTWPEGKVLWYNTVLNKSSNWGTSPLL 
WYFYSALPRGLGCSI^FIPLGLVDRRTHAPTVLiALGFMALYSLL 
PHKELRFI I YAFPMLNI TAARGCS YLLNNYKKSWLYKAGSLLVI 
GHLWNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGM1AYTHILMEAA 
PGLIALYRDTHRVLASVVGTTGVSLNLTQLPPFNVHLQTKLVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGFELGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDFESVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTP LGRPAVPRFGKPDGLRGRGVGS PEPG P TAP YLGRSMS YS 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
AS PGPGQPP LS S PTRGG VKKVSGVGGTTYE I S V 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS I DNRTRALVQALRRTTDPKLCIT 
RVEELTFHLLEFPEGKGVAVKERI I P YLLRLRQ I KDETLQAAVR 
EILALIGYVDPVKGRGIRILSIDGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMU3LFHMPLDECEELYRKLGSDV 
FSQNV I VGTVKMS WSHAFYDSQTWENI LKDRMGS ALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQAIRASSAAPGYFAEYALGNDLHQDGGLLLNNPSALAMHECKC 
LWPDVPLEC IVS LGTGRYES DVRNTVTYTS LKTKLSNVINS ATD 
TEEVH IMLDGLLPPDTYFRFNPVMCENI PLDESRNEKLDQLQLE 
GLKYI ERNEQKMKKVAKILSQEKTTLQKINDW IKLKTDMYEGLP 
FFSKL


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
SSEDIDQMFSTLLGEMDLLTQSLGVDTLPPPDPNPPRAEFNYSV 
GFIODIiNESLNALEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHSASLQAS I FSGAAS LGYGTNVAATG 3 SQ YEDDLP PPPADP 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV 
VKVHMNDNSTKSLMVDERQLARDVLDNLFEKTHCIX^NVDWCLYE 
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF 
KN P QN FY LDNRG KKES KETNE KMNAKNKE S LLE VRL I LQSGRKE 
KDVCS I FKS FASENNGKI 


6243 


1509 


614 


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RAS SRRLACGPQTRAGAETRSTAMI RANS AARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWASTITTGCCPA 
MGQAGAGPAGRKGSEAGGG PGRAHHAHPS PLPRE PRVRTGP PAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GP P I LAP I LS LTP ILSRWS CYFPRSR IAQGWHIiS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
I CI S LAFW 1 1 SMTASTY YGNLRP I S PWRWLFS VWP VLI VSNGL 
KKKS L DHSG ALGGLWG F I LT IAN F S F FTS LLM FFLS S S KLTKW 
KGEVKKRLDSEYKEGGQRNWVQVFCNGAVPTELALLYMIENGPG 
E I PVD FS KQ YSAS WMCLSLLAALACS AGDTWASE VGPVLS KSS P 
RL I TTWE KVP VGTNGGVTVVGLVS SLLGGTFVG I AYFLTQLI FV 
NDLD I SAPQWP I IAFGGLAGLLGS I VDS YLGATMQYTGLDESTG 
MWNS PTNKARH LAGKP ILDNNAVNL FS S VL IALLL PTAAWGFW 
PRG 


6246 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV 
QATHRGAVSNSLMLCILKIjASQMPLENTTVOXJMVFMIjLSNIALS 
HDC KG V 1 QKSNF LQNFLSLALPKGGNKHLSNLT I LWLKLLLN I S 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
P ANK P K 1 LANE KV I T VLAACLES ENQNAQRIGAAALWAL I YNYQ 
KAKTALKS PS VKRR VDEAYS LAKKTFPNS EANPLNAYYL KCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDDTSHAGP
PGPGRALLECDHLRSGVPGGRRRKDWSCSLLVASLAGAFGSSFL 
YGYNLSWNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
S 1 FAIGGL VGTL I VKM IGKVLGRKHTLLANNG FAISAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAI FI CIGVFTGQLLGLPELLGKESTWPYLFGVI WPAWQL 
LS L P FLPD S PR YLLLE KHNEARAVKAEQTFLGKAHVSQE VE E VL 
AE S R VQRS I RL VS VLh LiLRAP i V KWy Wivivi MAL I y U^.\j LiNA 
IWFYTNS I FGKAGIPPAKI PYVTLSTGGIETLAAVFSGLVIEHL 
GRRPLLIGGFGLMGLFPGTLTITLTLQDHAPWVPYLSIVGILAI 
I AS F CS G PGG I P F I LTGE F FQQSQRP AAF 1 1 AGTVNWLSNFAVG 
LLFPFIQKSLDTYCFLVFATICITGAIYLYFVLPETKNRTYAEI 
SQAFSKRNKAYPPEEKIDSAVTDGKINGRP 


6248 


56


1773 


VPPPRMMAAVPPGLE PWNRVR I PKAGNRSAVTVQNPGAALDLCI 
AAVI KE CHLVILS LKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGS IQDLFELFSSNENQPLTTKVCWP 
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII 
LNL VMVG L VS RLW VL Y KGVL KRL I LL YE P L FG LLQE VAR I Q PM P 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQS PRAS EETLLG I S KKAKQMKINVQNNVDLGQ P VKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VI GTPHAKS FVQRFREAES FTQLSEE IQMAWWCRS KKLKAQAI 
FLGNKLL KSNRL KHLEAQGTS L P KKLBC I KTS I CNHLLRGSG I K 
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLS HCTVHRTDLY PNS KQLLNSG VSMP V I Q TKE KM I 
HENLRG I HENETDS WTVMQ I NKNSTSGT I KETDDI DD I FALMGV 


6249


56


1773 


VPPPRMMAiVPPnTiFPWTJRVPTPVZirJTvrR QAVTVONPHaar nT.PT 

V r cc RPinftft v tr C\jXJCj t riVit\ V A J. r AxUJiMAOnv i. V wi'tLxft>\J_iiJiJk^i. 

AAVIKECHLVILSLKSQTLDAETDVIjCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNEKQPLTTKVCVVP 
S Q P WE LVLM KVLGAC KLLL RLLD CCC KTFLLTVKHLG LQ EF 1 1 
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVXNKRVF 
KE ES S E FD VRAFCNQL KHKATQE TS FDFKCS QS RLKTTKY S SQ K 

VI GTPHAKS FVQRFREAES FTQLS EE IQMAWWCRS KKLKAQAI 

TS KHHLRQRRSQNKFLRRQRKPQRKLQSTLLRE IQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI 
HENLRGIHENETDSWTVMQINKNSTSGT I KETDDI DDI FALMGV 


6250 


232 


1306 


LAAIjHIMALPFRKDLEKYKDLDEDELLXSNLSETELKQLETVLDD 
LD P ENALL P AG FRQ KNQTS KS TTG P FDREHLLS YLE KEALEHKD 
REDYVPYTGEKKGKIFIPKQKPVQTFTEEKVSLDPELEEALTSA 

eki lpvfde p pnptnvee s lkrt KENDAHLVE VNLNN IKNIPIP 
tlkdfakaletnthvkcfslaatrsndpvatafaemlkvnktlk 
S LNVESNF I tgvgilal idalrdnetlaelki dnqrqqlgtave 
lemakmleentnilkfgyqftqqgprtraanaitknndlvrkrr 
vegdhq 


6251 


62 


972 


tpgsgp^awaaaslsraaarcllargpgvraapprdprpshpe 
prgcgaapgrtlhftaavpaghnkwskvrhi kgpkdversri fs 

KLCLNI RLAVKEGGPNP EHNSNLAN I LE VCRS KHMPKST I ETAL 
KMEKSKDTYLLYEGRGPGGSSLLIEALSNSSHKCQADIRHILNK 
NGGVMAVGARHSFDICKGVIVVEVEDREKKAVNLERALEMAIEAG 

aedvketedeeernvfkficdasslhqvrkkldslglcsvscal 

EFI PNS KVQLAE PDLEQ AAHLI Q ALS NHEDV 1 HVYDNI E 


6252 


27 


1897 


eefctwiavrvgemetapkpgkdvppkkdklqtkrkkprrywee 
etvp ttagas pg p prnkknr elr pqrpknayi l kjcs ri s kkpq v 
pkkprewknpesqrglsgaqdpfpgpapvpvewqkfcridksr 
klphs kaktrs rle vaeaeeeets i kaarselllaeepg flege 

DGEDTAKICQADIVEAVDIASAAKHFDI^RQFGPYRLNYSRTG 

RHLiAFGGRRGlTVAALDWVTKKLMCEIlWMEAV^ 

AVAQNRWLH I YDNQGI ELHC IRRCDRVTRLEFLP FHFLLATAS E 

TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 

TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 

KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 

GQGKAS P P S LE Q P YLTHRLSGP VHGLQ FCP FEDVLG VGHTGG 1 T 

smlvpgagepnfdglesnpyrsrkqrqewevkallekvpaelic 
ldpralaevdvisleqgkkeqierlgydpqakapfqpkpkqkgr 
sstaslvkpjcrkvmdeehrdkvrqslqqqhhkeakakptgarps 
aldrfvr 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG

RHIAFGGRRGHVAAUDWVTKKLMCEINVMEAVRDIRFLHSEALL 
AVAQNRWLH I YDNQG I ELHCIRRCDRVTRLE FLPFHFLLATAS E 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TV^LWS PAMKEPLAKI LCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGTYQ PLS TRTLPHGAGHLAFSQRGLLVAGMGDWNI WA 
GQG KAS P P S LEQ P YLTHRLSGP VHGLQ FC P F ED VLG VGHTGG I T 
SMLVPGAGE PNFDGLESNPYRS RKQRQEWEVKALLEKVPAEL I C 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALGRRGGS QELSAAACGCFALRLRAPGSGRPALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLKBYRICMPLTVDEYKIGQLYMISKH 
SHEQSDRGEGVEWQNEPFEDPHHGNGQFTEKRVYLNSKLPSWA 
RAWPKI FYVTEKAWNYYPYTITEYTCS FLPKFS IHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKIiVTVKFEVWGLQTRVEQFV 
H KWRD ILL I GHRO AFAWVDE W YDMTMDD VR F YF KtJNTHFOTMT Y 
VCNQHSSPVDDIESHAQTST 


6255 


1 


1444 


PTRPQQELLVS LATVI FVASQ KALS VESKAVI KQQLES VSNGWT 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAEQCLTGLQEENYSSALSCIAESLKFYHKGIASLTAASTPLNP 
LS FQCE FVKLR I DLLQAFS QL I CTCNS LKTS P P P AI ATT I AMTL 
GNDLQRCGRISNQMKQSMEEFRSLASRYGDLYQASFDADSATLR 
NVELQQQSCLLISHAIEALILDPESASFQEYGSTGTAHADSEYE 
R RMMS VYNHVLEE VES LNG KYT P VS YMHTACLCNA 1 1 ALLKVPL 
S FQRY F FQKLQSTS I KLALS PS PRNPAEPIAVQNNQQLALKVEG 
WQHGSKPGLFRKIQSVCLNVSSTLQSKSGQDYKIPIDNMTNEM 
EQRVE PHNDYFSTQFLLNFAILGTHN I TVES S VKDANG I VMKTG 
PRTT I F VKSLE DP Y<300 T R LOOOOAOOPI /V5HOnPMA YTP F 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLESTSTSVPPAPGTMATDSWALA 
TOEQEAAAESLSNLHLKEEKIKPDTNGAVVKTNANAEKTDEEEK 
EDRAAQ SLLNKL I RSNL VDNTNQVE VLQRDPNS P L YS VKS FEEL 
RLKPQLLQG" VYAMGFNRPS KI QENALPLMLAE P PQNLIAQSQSG 
TGKTAAFVLAMLSQVEPANKYPQCLCLSPTYELAIjQTGKVIEQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
F ID P KK I KVFVLDEADVM I ATQGHQDQS I R I QRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR 
DE KFQALCNLYGAI TI AQAM I FCHTR KTAS WLAAELS KEGHQ VA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTNVCARG IDVEQVSW 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615 


AF I PAMAELI QKKLQGE VEKYQQLQKDLS KSMSGRQKLEAQLTE 
NNIVKEELALLIXSSNWFKLLGPVLVKQEIjGEARATVGKRLDYI 
TAEI KRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6258 


210 


615 


AFIPAMAELIQKKXQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE 
NNIVKEELALLDGSNWFKLIXSPVLVKQEI^EARATVGKRLDYI 

taeikryesqlrdlerqseoqretlaqlqqefqraqaakagapg 

KA 


6259 


2 


1540 


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV 
SVENGDRGSKTFNLGTDPVSLRNYPYKICDSCEMNLKNISGLII 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPSFGOSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTF I ESLKLNI SQRPHLEMEP YGCS ICGKSFCMNLRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIHQGAYTRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KPHLTNHQRTHTGE KPYECKQCGKTFCV KSNLTEHQRTHTGE KP 
YECNACGKS FCHRSALTVHQRTHTGEKPF I CNECGKS FCVKSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRS VLTKHQR IHTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALRELKVC^LGDTGVGKSS I VW 

RALAPMYYRGSAAAI I VYDITKEETFSTLKNWVKELRQHGPPNI 
WAIAGNKCDLI DVRE VMERDAKDYADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
SPACE PCRPDFAPRPAUjLRSGPRSAPAVTGKPALKGQPGPWPG 
MAEVS IDQSKLPGVKEVCRDFAVLEDHTLAHSLQEQE I EHHLAS 
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA 
QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
EIARKLQEEELLATQVDMRAAQVAQDEEIARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RP P P P I MTDGEDADYTHFTNQQS STRHFSKS ES SHKGFHYKH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDE PAGHRLSQEEI L 
GSTRLVSQGLEALRS EHQAVIiQSIiSQTI ECLQQGGHEEGLVHEK 
ARQLRRSMBNIELGLSEAQVMIJVLASHLSTVESEKQKLRAQVRR 
LCQENQWLRDELAGTQQRLQRSEQAVAQLEBEKKHLEFLGQLRQ 
YDEDGHTSEEKEGDATKDSLDDLFPNEEEEPPSNGLSRGQGATA 
AQQGG YE I PARLRTLHNL V I Q YAAQGR YEVAVP LCKQAIjEDLER 
TSGRGHPDVATMLNILALVYRLQNKYKEAAHLLNDALSIRESTL 

gpdhpavaatlnnlavlygkrgkykeaeplcqraleirekvlgt 
npipdvakql^lallcqnogkyeaveryyqralaiyegqlgpdn 
pnvartknnlas c yl kqg k yaeae tlykei ltrahvqe fgsvdd 
dhkpiwmhaeereemsksrhheggtpyaeyggwykackvssptv 

NTTLRNLGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVKLIISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLLTVPGDTSPSPKP 
EV FSNVPERDL SNVSN I HS S FATS P TG ASNS K YVS ADRNL I KNT 
AP VNTVMDS P VHLE P S SQ VG VI QNKS WE MP VDRLETLSTRD F I C 
PNSN I PDQES ShQS FCN S ENKVLKENAD FLSLRQTELPGNS CAQ 
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 

IQEASPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAFSKLTYK 
S S SGHEVENS TTDTQ V I S HE KENKLES LVLTHLSRCDSDLCE MN 
AGMPKGNLNEQDPKHCPESEKCLIjSIEDEESQQSILSSLENHSQ 
QS T Q P EMHK YGQL VKVELE ENAEDD KTENQ I PQRMTRNKANTMA 
NQSKQ I LASCTLLS EKDS ES S S PRGRIRLTEDDDPQI HHPRKRK 
VSRVPQPVQVSPSLLQAKEKTQQSLAAIVDSLKLDEIQPYSSER 
ANPYFEYLH IRKKI EEKRKLLCSVI PQAPQ YYDEYVTFNGS YLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLE WQLKLQELDPAT YKS I S I YE IQE FYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQE2WALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPVWVAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCV PH PKKP EHTL VLLDTEG LGD VKKGDNQNDS W I FTLAVLL 
SSTLVYNSMGTINQOAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SQKDKNFNLPRLCIRKFFPKKKC^VFDLPIHRRKIAQLEKLQDE 
E LD PE F VQQ VAD PCS Y I FSNS KTKTLS GG I KVNG P RLES LVLTY 
I NAI S RG D L P CM EN AVLALAQ I EN S AAVQ KA I AHYDQQMGQ KVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCS ALLQVI FS PLEEEVKAG I YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQ I LTE KE KE I E VE CVKAES AQAS AKMVE EMQ I KYQQMMEE K 
EKSYQEHVXQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265 


143 


1960 


KHRQENNALDMAPE I HMTG PMCL I ENTNGELVANPEALKILSAI 
TQPVVWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSW I FTLAVLL 
S S TLVYNS MGT INQQAMDQL YYVTELTHR I RS KS S PDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY 
INAI SRGDLPCMENAVLALAQI ENS AAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCSALLQVI FS PLEEEVKAGI YSKPGG 
YCLFI QKLQDLEKKYYEEPRKG I QAEE ILQTYLKSKESVTDAI L 
QTDQ I LTE KEKEIEVE CVKAES AQAS AKMVEEMQ I KYQQMMEEK 
E KS YQEHVKQLTE KMERE RAQLLEEQE KTLTS KLQEQAR VLKE R 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE 
GFTVNEINKKS IHI S CPKENASS KFLAP YTTFSRIHTKS ITCLD 
I S S RGGLG VS S S TDGTMKI WQAS NGELRRVLEGHVFD VNCCR F F 
PSGLWLSGGMDAQLKIWSAEDASCWTFKGHKGGILDTAIVDR 
GRNWSAS RDGTARLWDCGRS ACLG VLADCGS S INGVAVGAADN 
SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFIASQGDGSCFIVQQDLDYVTELTGADCD 
P VYKVATWEKQ I YTCCRDGLVRRYQLSDL 


6267 


3 


622 


LGMMKKNNS AKRGPQDGNQQPAPPEKVGWVRKFCGKG I FRE IWK 
NRYWLKGDQL Y I S E KEVKDEKN I QE V FDLS DYEKCE ELRKS KS 
RS KKNHS KFTLAHS KQ PGNTAPNL I FLAVS PEEKES W INALNS A 
I TRAKNR I LDE VTVE EDS YLAHPTRDRAKI QHS RRPPTRGHLMA 
VAS TS TSDGMLTLDL I QEEDPS PEEPTSLC 


6268 


160 


1368 


HRELCQNL PAGLS SAL I DNPLTLLLS I DT YVMLQEP VTFQD VAV 
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA 
PNS D I PEE EPAP S L KVQES SRDCALS STLEDTLQGGVQE VQDTV 
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKI P PRKRLRKRDSQVKSMKHNSRVKIHQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 

GERPFECQECGRTFNDRSAISQHLRTHTGAKPYKCQDCGKAFRQ 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL 
TQ YVKWTNDKS LGG I EGCLS KLKAADPTFVMGHAMATGLVL I GT 
GSS VKLDKELDLAVKTMVE I SRTQPLTRREQLHVSAVETFANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
IYPFWTPDIPLSSYVKGIYSrci^ETNFYDQAEKLAKEALSINP 
TDAWS VHTVAH IHEMKAE I KDGLEFMQHSETLWKDSDMLACHNY 
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDWDSCSML 
YRLQMEG VS VGQRWQDVLP VARKHSRDHI LLFNDAHFLMAS LGA 
HDPQTTQELLTTLRDASESPGENCQHLLARDVGLPLCQALVEAE 
DGNPDRVLELLLP I RYRI VQLGGSNAQRDVFNQLLIHAALNCTS 
SVHKNVARSLJjMERDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


S VTVT LG S EGDGR P P T YHLEEME Q E PQNG E P AE I KI I REAYKKA 
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
GPGWES ARQMQQKMKETLQNVRTRLE I LE KGLATS LQNDLQEVP 

VT VDTTPDTJ VTMUTPP VT 0^!GT7C C "h tDAUHTVWfSNFT'CTDC 

KiiZ trCtC tr r rUJML.lSrtJji' EiC\iOt DOAryiliUi VJNIjIN J.D 1 r SttLxi-\.Y J\t\ 

PAS LSLPSQS CPAEAP PAYTPQAAEGH YTVS YGTDSGE FSSVGE 
EFYRNHSQPPPLETLGLDADEIjI LI PNGVQ I FFVNPAGEVSAPS 
YPGYLRIVRFLDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK 
CTAGAYMFPDTMLQAAGCFVGWLSSEIiPEDDRELFEDLLRQMS 
DLRLQ ANWNRAEE ENE FQ I PGRTR P SS DQLKE ASGTD VKQLDQG 
NKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
SEKVAHNI LSGAS W VS WGLVKGAE I TGKAI QKGAS KLRERIQPE 

EKrVbi PAV 1 KoLi I lAAUAIooAArWoyr JjVLAj 1 V*\1X\~ VuR 

ELAPHVKKHGSKLVPESLKKDKDGKSPLDGAMWAASSVQGFST 
VWQGLECAAKCIVNNVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYN INN IG I KAM VKKTATQTGHTLLEDYQ I VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


trC^VKlAUiyivQjKnilvbijolHr V t'Ljo\^Kij ViSIjIj ViN JLirlNrtK VIjV i\3f\ 

TGLLGRAVHKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHH 
1 1 HD FQPHVI VHCAAE RRPD WENQPDAASQLNVDAS GNLAKEA 
AAVGAFLIYISSDYVFDGTNPPYREEDIPAPLNLYGKTKLDGEK 
AVLENNLGAAVLR I P I L YGEVE KL E E S AVTVM FDKVQ F SNKSAN 

TKYEMACAIADAFNLPS SHLRP ITDS P VLGAQR PRNAQLDCSKL 
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


1136 


528 


gavmedaaapgrtegvlerqgappaagqggalveltptpgglal 
vs p yhthragd pldlvalaeqvq kad e f i ranatn/klt vi aeq i 
qhlqeqarkvledahrdanlhhvacnivkkpgniyylykresgq 
qyfs i ispkewgtscphdflgayklqhdlswtp yediekqdaki 
smmdtllsqsvalppctepnfqglth 


6273 


256 


843 


SCPRVSPECRSLGCQVMFSLPLNCSPDHIRRGSCWGRPQDLKIA 

HVLIEDHRIVFSCKNADGVELYNEIEFYAKVNSKDSQDKRSSRS 
I T C FVRKWKE KVAW PRLTKED I KP VWLS VDFDNWRD WEGDEE ME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGG AGAARS LS R FRG CLAG ALLGDCVGS F YEAHDT 
VDLTS VLRHVQS LE P DPGTPGS ERTEAL Y*YTDDTAMARALVQS L 
LAKEAFDEVEMAHRFAQEYKKDPDRGYGAGVVTVFKKLLNPKCR 
DVFE PARAQ FNG KGS YGNGGAMRVAG I S LAYS S VQDVQKFARLS 
AQLTHASSIX^YNGAILCA^VHLALQGESSSKHFLKQLLGHMED 
LEGDAQSVLDARELGMEERPYSSRLKKIGELLDQASVTREEWS 
ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 
SLGGDTDTIATMAGAIAGAYYGMDQVPESWQQSCEGYEETDILA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVFRPAXTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRS H FRD K YRLPKNE TDE S Q I QMAGGD VE L PRELAKM I 
EEDTEEEEEKASVLGOLA^LPGI^GSLKDKAQATLGDLKQSAE 
KCHVM 


6276 


797 


97 


TLLPLPPLPDTEGMILLNTGLEGTVAENPVPIVHTPSGNILTLE 
S CLQQLATHPGHWGIHLQIAEPAALRP S LALLARLS SLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSGYREQLLTDMLELCQGLWQ PVS FQMQAMLLGHS TAGAI GRL 
LASSPRATVTVEHNPAGGDYASVRTALLAARAVDRTRVYYRLPQ 
GYHKDLLAHVGRN 


6277


4600 


2744 


MAFRTEMGLYYSYFKTIVEAPSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPBVILASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TES CEGLGDPACFYVAVI F I LNGLMMALFF I YGT YLSGSRLGGL 
VTVLCFFFNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
TKLYRGSL I ALC ISNVFFMLP WQFAQFVLLTQ I ASLFAVYWGY 
IDICKLRKI IYIHMISLALCFVLMFGNSMLLTSYYASSLVI I WG 
ILAMKPHFLKINVSELSLWVIQGCFWLFGTVILKYLTSKIFGIA 
NDAH IGNLLTS KFFSYKD FDTLLYTCAAEFDFMEKETFLRYTKT 
LLLPWLVGFVAIVRKI I SDMWGVLAXQQTHVRKHQFDHGELVY 
HALQLLAYTALG I LIMRLKLFLTPHMCVMASLI CSRQLFGWLFC 
KVHPGA I VFAI LAAMS I QGS ANLQTQ WNT VGEFSNL PQEEL I E W 
I KYSTKPDAVFAGAMPTMAS VKLSALRP I VNHPHYEDAGLRART 
KIVYSMYSRKAAEEVKRELI1CLKVNYYILEESWCVRRSKPGCSM 
PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEVV 
KE 


6278 


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVEGQQYAWGTVLLL 
IRI I LE YCQGVDNI P S VTTDMLTRL S DLLKYFNS RS CQLVLGAG 
ALQWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ 
YSMIjRHFDHITKDYHDHIAEISAKLVAIMDSLFDKLLSKYEVKA 
PVPSACFRNICKQMTKMHEAIFDLLPEEQTQMLFLRINASYKLH 
LKKQLSHLNVINDGGPQNGLVTADVAFYTGNLQALKGLKDLDLN 
MAEIWEQKR 


6279 


127


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTL 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR 
LS WSG I P KPVRPMTWKLLSGYLPANVDRRPATLQRKQKE YFAF I 
EHYYDSRNDEVHQDTYRQIHIDIPRMSPEALILQPKVTEIFERI 
LFI WAI RHPASG YVQG INDLVTPFFWFI CEYIEAEE VDTVDVS 
GVPAEVLCNIEADTYWCMSKLLDGIQDNYTFAQPGIQMKVKMLE 
ELVSR I DE Q VHRHLDQHE VR YLQFAFRWMNNLLMRE VP LRCT I R 
LWDTYQSEPDGFSHFHLYVCAAFLVRWRKEILEEKDFQELLLFL 
QNLPTAHWDDEDISLLIAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQF IQALLDSEE ENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA 
QKHS FP RMLHQRERGLCHRGS FS LG EQS R V I S HFLPND LGFTDS 
YS QKAFCG I YS KDGQ 1 FMS ACQDQTI RLYDCR YGRFRKFKS IKA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHI CNI YGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDG I TF I DS KGDARYLI SNSKDQTI KLWDIRRFSSR 
EGMEASRQAATQQEWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKWVYDLLSGHrVKK 
LTNHKACVRDVSWHP FEEKI VSSS WDGNLRLWQ YRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE 
DVDIAQVUIYLLRRGQVRLVO^SGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTCjVEIATGQLGLRRAA 
QKHS FPRMLHQRERGLCHRGS FSLGEQSRVISHFLPNDLGFTDS 
YS QKAFCG I YS KDGQ I FMS ACQDQT I RLYDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IES HEDDVNAVAFAD IS SQ I LFSGGDDAI CKVWDRRTMREDDPK 
PVGALAGHQDG I TF IDS KGDARYLI SNSKDQTI KLWDIRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVWYDLLSGHIVKK 
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVI IR 
FVTNTTKESKQDLLERLRKLEFDISEDEirTSLTAARSLLERKQ 
VR PMLLVDDRALPDFKG I QTSD PNA WMGLAP EHFH YQ I LNQAF 
RLLLDGAPL I AI HKAR YYKRKDGLALGPG P F VTALE YAT D TKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL 


6283 


140 


1043 


LSLFG IHVMNP FWSMS TSS VRKRS EGEEKTLTGD VKTS P PRTAP 
KKQL P S I P KNAL P ITK PTS P APAAQ S TNGTHAS YGP F YLE YS LL 
AE FTL WKQ KL P G VYVQ PS YRS ALMWFGV I F I RHGL YQDGVFKF 
TVYI PDNYPDGDCPRLVFDI PVFHPLVDPTSGELDVKRAFAKWR 
RNHNH I WQVLM YARRVFYK I DTAS P LNPEAAVL YE KDIQLFKSK 
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKS VHVAGLS W VKPGS VQP FSKEE KTVAT 


6284 


1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLEVI^GRNLI^EYKSSSHRIFRLNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LAPGSGGTDSDS S FPPTPTAERS VA I S VKDQRKAI KALLAWVQR 
KTRKYGVAVQD FAGS WRS G LAFLAV I KAI DP S L VDMKQALENST 
RENLEKAFSIAQDALHI PRLLEPEDIMVDTPDEQS IMTYVAQFL 
ERFPELEAEDI FDSDKEVP I ESTFVR I KETPSEQES KVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRIiDGVSSHALSDS 
STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 
LP I KKTVHFEADTYKDP FCSKNLSLCFEGSPRVAKESLRQDGHV 
LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 
CNGALESTARHDEESPiSLSPPGENTVMADSFQIKVNLMTVEALE 
EGDYFEAI PLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA 
EAFENHAEKLGKRS IKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
liAPHEDHQQRETKENDPMDSHQSQESPNLENIANPLEENVTKES 
I SSKKKEKRKHVDHVESSLFVAPGSVQSSDDLEEDSSDYS I PSR 
TSHSDSS I YLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


S CKTENLLEMWWFOQGLS FLPSALVI WTSAAFI FS Y I TAVTLHH 
IDPALPYI SDTGTVAPEKCLFGAMLNIAAVLCIATIYVRYKQVH 
ALS PEENVI I KLNKAGLVLGI LS CLGLS I VANFQ KTTL FAAHVS 
GAVLT FGMG S LYMF VQT I LS YQMQPK I HGKQ VFW I RLLL V I WCG 
VSAIjSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMITTAA 
EWSMSFSFFGFFLTYIRDFQKISLRVEANLHGLTLYDTAPCPIN 
NERTRLLSRDI 


6286 


1619 


276 


KAG AS CCG S ANP YVS VGKS CVLLAMAQLQTRF YTDN KK YAVDD V 
PFS IPAASE IADLSNI INKLLKDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENISSEEWEIEYVEKYTAPQPEQCMFHDDWISSIK 
GAEEW I LTGS YDKTSRI WSLEGKS IMTI VGHTDWKDVAWVKKD 
S LS CLLLSAS MDQT I LLWEWNVERNKVKALHCCRGHAGS VDS I A 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDRTIRVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVS LSLTSHTGWVTS VKWSPTHEQQLI S GSLDN I VKLWDT 
RS CKAPL YDLAAHED KVLS VDWTDTGIJjLSGG ADNKL YS YR YS P 






WO 01/53312 



PCT/US00/34263 



TTSHVGA


6287 


278


1482 


MQFFFNFQ IGLRSTSG KE KYSGDAGFLGDALQLFLQCLALDEDF 
APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS 
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
M P ARE D CL KRVS S E P VLS VQE KG VLL KRKLSLLEQDV I VNE DGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT 
QLLEEL I VKYLPDELS E R K KI YDEETAELSHLTKNVPI FVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
GCMLQIRNVHFLPDGRSVVDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6288 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKI GPS I LNSDLANLGAECLR 

MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 

HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM

KVGIaAIKPGTSVEYUAPWANQIDMALVMTVEPGFGGQKFMEDMM 

PKVHWIJiTQFPSLDIEVIXSGVGPDTVHKCAEAGANMIVSGSAIM 

RSEDPRSVINLLRNVCSEAAQKRSLDR 


6289 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 
HMMVS KPEQ WVKPMAVAGANQ YT FHLEATENPGAL IKDI RENGM 
KVGLAI KPGTS VEYLAPWANQ I DMAL VMTVE PG FGGQKFMEDMM 
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
DI TRES S FTS ADTGNS LS AFPS YTGAGISTEGS SDFS WG YGELD 
QNATE KVQTMFTAIDELL YEQKLS VHTKSLQEE CCX3WTAS F PHL 
RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSS YAHKASS IAKSSSFCSMERDEEDS 1 1 VSEGI I EEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWS CMEQLTRSHWEGFASDDESNVAVTRPDSESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG 
WPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TS S LS YT VQS TRRRNP P PRTLHP I STS HS CAET P RS VE E I LRGA 
RVP VAP D S LS S P S P TPLS RNNLL P P I GTAEVEHVS T VG P QRQM K 
PHGDS SRAQSAWDEPNYQQ PQERLLLPDFFPRPNTTQS FLLDT 
QYRRS CAVEY PHQARPGRG SAGPQLHGSTKS Q S GGR P VS RTRQG 
P 


6291 


1732 


602 


LVAKMAS SASARTPAGKRVINQEELRRLMKE KQRLSTSRKR I ES 
PFAKYNRLGQLSCALCNTPVKSELLWQTHVLGKQHREKVAELKG 
AK.c.AbyvjofaAbbArQb VKRKAi' DADDQDVKRAKATLVPQVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD 
G E RKRGDAS KPLS DAQGKEHS VSSS RE VTS S VLPND FFSTNP P K 
AP 1 1 PHSGS IEKAE IHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTI S EAI VAEEDE EGRLDRQ 
IGE IDEQ I E CYRRVEKLRNRQDEIKNKLKEI LTI KELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6292 


1835 


1142 


TC PGAM KMVAPWTRF YSNS CCLCCHVRTGT I LLGVWYLI INAW 
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI 
L I CAMATYGA YKQRAAW 1 1 PFFCYQI FDFALNMLVAITVLI YPN 
SIQEYIRQLPPNFPYRDDVMSVNPTCLVLI ILLFISI ILTFKGY 
LIS CVWNCYRY INGRNS SDVLVYVTSNDTTVLLPP YDDATVNGA 
AKEPPPPYVSA 


6293 


2382


1035 


FWCTLGT VDVH P IGWCA I NS KI LVP PRT I HAKFTDWKG YLMKRL 
VGS RTLPVDFHI KMVESMKYP FRQGMRLE WDKSQVS RTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK
LEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDGLDWFCY 
HAS SHAI FPATFCQKNDI ELTP PKG YEAQTFNWENYLE KTKS KA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRWHRL 
LS IHFDGWDS EYDQWVDCES PD I YPVGWCELTGYQLQP P VAAE P 
AT P LKAKEATKKKKKQ FG KKR KRI P PTKTR P LRQGS KKPLLEDD 
PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6294 


354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCL»GTDRGFPELSHHC 
KNV I AT ASD YDMAE I TNI RPS FDVS P WAGIi I GAS VL WCVS VT 
VFVWSCCHQQAEKKHKNPPYKF IHMLKGIS I YPETLSNKKKI I K 
VRRDFCDGPGREGGRRNLLVDAAEAGLIiSRDKDPRGPSSGSCIDQ 
LP I KMDYGEELRSP I TS LTPGES KTTS PSS PEEDVMLGSLTFS V 
DYNFPKKALVVTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDD V I G E VMVP LAG VDPS TGKVQLTRD 1 1 KRN I QKC I S RG ELQ V 
SLSYQPVAQRMTWVLKAJUiLQKMDIAGLSGNPYVKVNVYYGRK 
RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD 

RTTKNEWGRL I LGAHS VTASGAEHWREVCES PRKPVAKWHSLS 
EY 


6295 


2795 


617 


VS S ALLTGATSGS DAAKS EGAS AS PLS CTNAVAMDRPDEGP PAK 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPQQRPRLQEETEAA 
QVLADMRGVGLGPALPP P P PYV I LEEGG I RAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
(ahjjCi 1 ^£>MVL»WA_fyKlj vUfKbKisKAl 1 X VEDEDEDERESMRSSR 
RRRRRRRRKQRKVKRESRERNAERMESILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLIIQHIPGFWVKAFLNHPR 
IS I LINRRDEDI FRYLTNLQVQDLRHI SMGYKMKLYFQTNPYFT 

FFSWFSNHSLPEADRIAEIIKNDLWVNPLRYYLRERGSRIKRKK 
QEMKKRKTRGRCEWIMEDAPDYYAVEDIFSEISDIDETIHDIK 
I SD FMETTD YFE TTDNE I TD I NEN I CDS ENP DHNEVPNNE TTDN 
NE S ADDHETTDNNE S ADDNNENPEDNNKN1TJDNEENPNNNENT Y 
GNN F FKGG FWGS HGNNQDS S DS DNEADE ASDDEDNDGNEGDNEG 
SDDDGNEGDNEG S DDDDRDI E YYE KV I E D FDKDQAD YEDV I E 1 1 
S DES VE EEG I EEG I OODE t> I YE EGNYEE Rf5 <? v nuwF jt rtF nonnc 
DLEDVLQVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDSLPPTGTSS PVTARNA I PEARCCVWLLDGTTV 
EAVR PARE R LAR KELRQKRMQQ FS RDS A YSS NKDS TCLLT ERDT 
LGTSLQFPSPFSGTISFGSFSDSGIFPLGSQCCLGFQQFSISGK 
KWALIHKRVRLSVFGARWGRI YFGK 


6297 


1 


922 


QRAAAASPSSOSPRGAEYGAIjMAMEGYWRFLALLGSALLVGFLS 
VIFALVWVLHYREGI^VTOSALEFNWHPVIiMVTGFVFIQGIAI I 
VYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISVVAVFENHNVN 
NIAI^SUiSWVGLIAVICYLLQLI^GFSVFIiPWAPLSLRAFL 
MPIHVYSGIVIFGTVIATALMGLTEKLIFSIiRDPAYSTFPPEGV 
FVNTLGLLILVFGALIFWIVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNOTVAARK^NIALDEAGQRSTM 


6298 


3 


985 


SVPLPJUiSLSGTLQGAGTTTKMAVARLAAVAAWVPCRSWGWAAV 
P FGPHRGLS VLLAR I PQRAPR WLPACRQ KTS LSFLNR PDL PNLA 
YKKLKGKS PGI IFIPGYLSYMNGTKALAIEEFCKSLGHACIRFD 
YS GVGSSDGNSE ESTLGKWRKDVLS 1 1 DDLADGPQ ILVGS SLGG 
WLMIiHAAIARPEKWALIGVATAADTLVTKFNQLPVELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
LHGMKDD I VPWHTSMQ VADRVLSTD VDV I LRKHSDHRMRE KAD I 
QLLVYTIDDLIDKLSTIVN 


6299


512 


814 


ECDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS 
SIDAMDDSAFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA
AQSPFSIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR
LVIGSLPAHLSPHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR
ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKELGCLLGSDICLTP 
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW
IFSQYCFLDFCNDPQNRGLYTP


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS
PVLLNCLPGDLRPLDELYAQKXKYKAISEELDHALNDMTSL 


6303 


2 


1961 


YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM
KVDLVSFLSSPIMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELDIDENPASDFDDSGSLLGFKYGSGQKYGGIPNFS
HRQVRYLEKNVKLKSKYLDMRKQIKMKNKHIFFTKESEKPFFKK
SKILSKVEKFLTWVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLbATV 
PDEQDCVTQEVPDSRQAETEAEVKKKKNKKKNKBCVNGLPPEIAA 
VPELAKY WAQ R YRL FSR FDDG I KLDREG WFS VTP E K I AEH I AGR 
VSQSFKCDVWDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA 
RNNAE VYG I AD K I E F I CGD FLLLAS FL KAD WFLS P P WGG PD YA 
TAETFDIRTMMSPDGFEIFRLSKK1TNNIVYFLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRARVDRSRES PGGDLRHPGRVRRD I TLSGHPRLSTQH WLLRE 
DEVGDPGTKDLGHPQHGSPIQETQSEWTLVSPLPGSDMAALPA 
WRATSGLTLWPKTAEGRDLLGAENRALTGGQQAEDPTLASGAYQ 
WPGS VEKLQGS VWCDAETLLS SSRTGGQAP PWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL 
CGLI KR PGDLP E VLS FHVDRVLG LRRS L PAVARRFHS P LLP YRY 
TDGGARPVI WWAPD VQHLSDPDEDQNSLALGWLQYQALLAHSCN 
WPGOAPCPGIHHTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD 
P CVEE RLRE KCRN P AE LRL VH I LVRS S D P S HL VY I DNAGNLQH P 
EDIOjNFRLLEGIDGFPESAVKVLASGCLQNMLLKSLQMDPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


95 


420 


nmiwrgrstyrprprrsvpppeligpmlepgdeepqqeepptes 
rdpapgqereedqgaaetqvpdleadlqelsqsktgdecgdgpd 
vqgkiltkseqfkmpegr 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSEADKSALMD 
ESEDSGVIPGSHSENALHASEEEEGEGGKAQSSLGYIPLMRWQ 
SVRHTTRKSSTTLREGWVVHYSNKDTLRKRHYWRIjDCKCITLFQ 
NNTTNRYYKE I PLSE ILTVESAQNFSLVPPGTNPHCFEI VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQAS LS ISVSNSQ I QENVD IATVYQ I FPDEVLGSGQ F 
GVVYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAILQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 
TKFLI TQ I LVALRHLHFKNT VHCDLKPENVLLASADP FPQVKLC 
DFGFAR 1 1 GEKS FRRS VVGTPAYLAPEVLLNQGYNRS LDMWS VG 
VI MYVSLSGTFPFNEDED INDQ I QNAAFMYPAS PWSHISAGAID 
LINNLLQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 
YITHESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
LAERISVL


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKWRQSKFRHVFG 
QPVTCNDQCYEDIRVSRVTWDSTFCAVNPKFIAVIVEASGGGAFL 
VLPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 
TVMVWQ I P ENGLTS P LT E P VWLEGHTKR VG 1 1 AWH P TARNVLL 
S AGCDNWL I WNVGTAE E LYRLDS LH PDL I YNVS WNHNGS L FCS 
ACKDKS VRI IDPRRGTLVAEREKAHEGARPMRAI FIiADGKVFTT 
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV 
VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQRGMGSMPKR 
GLEVS KCE IARFYKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PEAALEAEEVTVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DSRPAMAPGSSHLGAPASTTTAADATPSGSLARAGEAGKLEEVM 
QELRALRALVKEQGDR I CRLEEQLGRMENGDA 


6308 


2 


1118 


GRPTR P EKMLLSLVLHT YS MRYLLPS WLLGTAPTYVLAWG VWR 
LLS AFL P AR F YQ ALDDR L Y CVYQ S MVLFFFENYTG VQI LL YGDL 
PKNKENIIYLANHQSTVDWIVADILAIRQNALGHVRYVLKEGLK 
WL P L YG W YFAQHGG I YVKRS AKFNE KEMRNKLQS Y VDAGT PM YL 
VI FPEGTR YNPEQTKVLS AS QAFAAQRGLA VL KHVLTPRI KATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRFPGKSVNSKLSIKKTLPSMLILSGLTAGMLMTDAGRKL 
YVNTW I YGTLLGCLW VT I KA 


6309 


220 


563 


LVAEVKEPCSLPMLSVDMENKENGSVGVKNSMENGRPPDPADWA 
VMDWNYFRTVGFEEQASAFQEQEIDGKSLLlLMTRNDVLTGLQL 
KLGPALKIYEYHVKPLQTKHLKNNSS


6310 


36 


979 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSASE 
AQRGSLAELNVAAAGLWADCDQPLYDCPMCGLICTNYHILQEHV 
DLHLEENSFQQGMDRVQCSGDLQLAHQLQQEEDRKRRSEESRQE 
IEEFQKLQRQYGLDNSGGYKQQQLRNMEIEVNRGRMPPSEFHRR 
KADMMESLALGFDDGKTKTSGI I EALHR YYQNAATD VRRVWLS S 
WDHFHSSLGDKGWGCGYRNFQMLLSSLLQNDAYNDCIjKGMLIP 
CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRLAAAARTGHGVGRRARLACLGEPRVKAAVMLTL
ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL
PCTCKVHFPDPNKLHCFQLTVTPDEGYYQGGKFQFETEVPDAYN
MVPPKVKCLTKIWHPNITETGEICLSLLREHSIDGTGWAPTRTL
KDVVWGLNSLFTDLLNFDDPLNIEAAEHHUUDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKKLPGVGVFGTGSSARVXVPLLRAEGFTVEALW 
GKTEEE AKQ LAEEMN I AFYT S RTDD I LLHQD VDLVC I S I PPPLT 
RQISVKALGIGKNVVCEKAATSVDAFRMVTASRYYPQLMSLVGN 
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RH VTS DD FC F FQMLMGGG VCS TVT LN FNKPGAF VHE VMW 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL 
YMQSWDAIKRSSRSGEWEAVEVLTEEPDTNQNLCEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPS P FS PACVHRS PLS FHGCWFYFVWFMP LGVL 
FHRRRAHGCTLS CSS FVEQPTAMEAEETMECLQEFPEHHKM I LD 
RLNEQREQDRFTD ITL I VDGHHFKAHKAVLAACSKFFYKFFQEF 
TQE PLVE I EGVS KMAFRHLIEFTYTAKLM I QGEEEANDVWKAAE 
FLQMLEAI KALEVRNKENS APLEENTTGKNRAKKRKIAETSNVI 
TES LPSAESE P VE IEVE I AEGTI EVEDEG I BTLEEVASAKQS VK 
YIQSTGSSDDSALALLADITSKYROGDRKGQIKEDGCPSDPTSK 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 








SFKCEI CNKR YLRES AWKQHLNC YHLS EGG VS KKQRTGKK I H VC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCMERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCE'LWFMQGKELRRHLSDAHNISERLVTEEVLSVETRVQ 
T E P VTSMT 1 1 EQ VG KVH VL PLLQ VQ VDS AQ VTVE Q VHPDLLQDS 
QVHDSHMSELP EQ VQVS YLE VGRI QTEEGTEVHVEELHVERVNQ 
M P VE VQTE LLE AD LDHVT P E I MNQE ERESSQ ADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVVVFMPLGVL 
FHRRRAHGCTLSCS SFVEQPTAMEAEETMECLQE FPEHHKM I LD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 

TT^ T7 D T . T TT? TT?r , T70V\iAT70TJT TDDTVts vt MTflPiPPir ft Ktrnruvik tits 
X^r.f l-i veil C(j VolSJTAr Krlij ltir J. i I AiU^iyObcil^/^DVWKAAE 

FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRK1AETSNVI 
TESLPSAES E P VE I EVE I AEGT I EVEDEGIETLEEVASAKQS VK 
YIQSTGSSDDSALALLADITSKYRQGDRKGQI KEDGCPSDPTS K 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 

TEPVTSMTI IEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDS HMSEL P EQ VQ VS YLE VGR I QTE EGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPE IMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


l^LAVNVVTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA 
I DGKOARRTNS CS PLGELFDHGfD 7 1 Q TVFMA VCi A <3 T A A R T /5 TV 
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVI 
VF VLS AFGGATM WD YTI P I LE I KLK I L P VLGFLGG V t FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLIIILAIMIYKKSATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYLQDTVFLGP 
GLLFLDQYFNNF IDEYWLWMAMVISS FDMVI YFSALCLQ ISRH 
LHLNI FKTACHQAPEQVQVLSSKSHQNNMD 


6316 


1503 


792 


VS AGAGTG IMGGTTS TRRVTFEADENEN I TWKG I RLS ENVI DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
ARQL E EKDRVL KKQD AFYKEQLARLE E RS S EFYR VTTEQ YQKAA 
EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


839 


P EAQTS AVLARE KGHLPTMRHEAPMQMAS AQDARYGQ KDS SDQN 
FDYMFKLLIIGNSSVGKTSFLFRYADDSFTSAFVSTVGIDFKVK 
TVF KNE KR I KLQ I WDTAGQERYRT ITTAYYRGAMGF I LMYD I TN 
EESFNAVQDWSTQ I KTYSWDNAQVILVGNKCDMEDERVISTERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRREAPPL 
LGPLLSPFPLPAGSWHRQMLRSSLRFPITNSAGAPCKAAGRMNI 
LAP VRRDR VLAEL P QCLR KE AALHGHKD FH PRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLGIPFSLQLWDTAGQERFKCIASTYYRGAQAIIIVFNLN 
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
EKDALQVAQEMKAE YNAVS SLTGENVRE FFFRVAALT FEANVLA 
ELE KSGARR IGDWR rNSDDSNLYLTAS KKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLG KKWLVP YTS EHVP S R YHE WMKS E E LQRLT 
ASEPLTLEQEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTLGE IEVMIAEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 








TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDS FYLLYRE IARSCNCYMEALALVGAWYTA 
RKSITVI CDFYSL I RLHFIPRLGSRADLI XQYGRWAWS GATDG 
IGKAYAEELASRGLNI ILISRNEEKLQWAKDIADTYKVETDI I 
VADFSSGRE I YLP I REALKDKDVG I LVNNVGVFYPYPQ YFTQLS 
E D KLWD 1 1 NVN I AAAS LM VHWL PGMVER KKGAI VT I S S G S CCK 
PTPQLAAFSAS KAYLDHFSRALQYE YASKG I FVQSL I PF YVATS 
MTAPS NFLHRCSWLVPS PKVYAHHAVSTLG I SKRTTG YWSHS IQ 
FLFAQ YM PE WL WVWGAN I LNRSLRKEALS CTA 


6321 


1418 


341 


HRKAALGALMAGRLLGKALAAVS LS LALAS VT I RSSRCRG I QAF 
RNS FS S S WFHLNTNVMSGSNGSKENSHNKARTS P YPGS KVERSQ 
VPNEKVGWLVE WQD YKPVE YTAVS VLAG PRWADPQ IS E SNFSPK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 

I PGGMVDPGEKI SATLKREFGEEALNSLQKTSAEKRE IEE KLHK 
LFS QDHLVI YKGYVDD PRNTDNAWMETEAVNYHDETGE I MDNLM 
LE AG D DAG KVKWVD INDKLKLYAS HSQFI KLVAEKRDAHWS EDS 
EADCHAL 


6322 


2047 


1083 


NOE I LKNVF ^ ^RTVO PHFT .KFT.T^ T .HW^VnvrJR'HPnWTn'm/QT c: 
i» w 4 -* - 1 - i —** vx ' rur LictC jjxjij uun o v u v vjx\_nirvjini i un v o ± o 

WSINCCDDGEGSQQEEVISSEDIGASIFNGQKKVLYYADALrEI 
AFWPSPVESLTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSVVWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
I HP LNTGLFR I K I QGATGKFNM V I P L VDGM I VS RRALG F L VRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQEARVPLDGAFWIPRPPAGSPKGCFACVSKPPALQA 
PAAPAPEPS AS P PMAPTLFPMESKS S KTDSVRAAGAPPACKHLA 
E KKTMTNP TTV I E VY PDTTE VND YYLW S I FNF VYLNFCCLG FI A 
LAYSLKVRDKKLLNDLNGAVEDAKTDRblNITRSGLAASCIMLW 
MALSVIATHRGLRSSASILVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDOQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 

WTE LETHGS QTQP ER VKS WADNLWTHQNS SS LQTHPEGACP S KE 
PSADGSWKELYTDGSRTQQDXEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PFWS FRKHYPWVQLSGHAGNFQAGEDGR I LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFNQMEDLLADFEGP 
S I MDC KMGS RTYLE EEL VKARE RPRPR KDMYE KMVAVDPG APT P 
EEHAQG AVT KPR YMQWRETMS S TSTLG FR I EG I KKADGTCNTN K 
KKTQALEQ VTKVLEDFVDGDHVI LQKYVACLEELRE ALE I S P F F 
KTH EWGSS LLF VHDHTGLAKVWM I DFG KTVAL P DHQTLS HRL P 
WAEGNREDGYLWGLDNMI CLLQGLAQS 


6325 


155


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLSEKDRMELLEIAKTNAAXALGTTNIDLPASLRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238


680 


GEPS P ATQQKPS ATGAGVLHQHFSSGH I YVLMGLL PPPWTISFT 
VQTTLQP PGGLPAAP VSGRMAFE PVGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWE FVLPS VS LTA 








QAWGGVGQSASSGVP


6327 


1 


1337 


SLARLAPAGGSWMPTQQPAAPSTRAPKPSRSLSGSLCALFSDA 

DSGSGMKA2LPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 

GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 

LRAERRAKQEAERALKQARKGEQGGPP PKAS PSTAGETPSGVKR 

ijral rUVUUulJjKRIjVKKPBRQQVPTRKDYGS 

RQNS LTQ FM S IPS S VIHPAM VRIX5LQ YS QGL VRGSNARCIALLR 

ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPIjSASMHN 

AI KFLNKE I TS VGS SKREEE AKSELRAAI DRYVQE KI VLAAQAI 

SRFAYQKISNGDVILVYGCSSLVSRILQEAWTEGRRFRWWDS 

RPWLEGRHTLRSLVHAGVPASYLLIPAASYVLPEVSTEEKDSKV 

GGEKV 


6328 


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCSL 
RH FACEQN L LS R PDGS AS FLQGDTS VLAG VYG P AEVXVS KE I FN 
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 
TWLQWSDAGSLLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SS EVAAGGGTRSAMAEGSGE WTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDIERKSLAINEBFVSIFKEVKEELESISEDVQAMSNCCQDMT 
S RLQAAKEQTQDL I VKTTKLQSESQKLE I RAQVADAFLS KFQLT 
SDEMSLLRGTREGPITEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GliEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
MEALQDRPNOjYKYTLDEFGTARRSTWRGFIDALTRGGPGGTPR 
P I EMHSHDPLRYVGDMLAWLHQATASEKEHLEALLKHVTTQGVE 
ENIQEWGHirEGVCRPLKVRIEQVIVAEPGAVLLYKISNLLKF 
YHHTI SG I VGNSATALLTTI EEMHLLSKKI FFNSLSLHASKLMD 
KVEL P P P D LG P S S ALN Q T LML L R E VXjAS HDS S WP LD AR Q AD F V 
QVLS CVLDPLLQMCTVSASNLGTADMATFMVNS LYMMKTTLALF 
E FTDRRLEMLQFQ I EAHLDTL I NE QAS YVLTR VGLS Y I YNTVQQ 
HKPECX3SLANMPNLDSVTLKAAMVQFDRYLSAPDNLLIPQLNFL 

SPQQVQTLLS 


6330 


1151 


333 


FF Y YTFYENKTFSRKMVAE KETLS LNKCPDKMPKRTKLLAQQPL 
PVHQPHSLVSEGFTVKAMMKNSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIEKLDYHHYLPLFFDGLC 
EMTFP YEFFARQGIHDMLEHGGNKI LP VL PQLI I PIKNALNLRN 
RQVT CVTLKVLQHL WS AEMVGKALVP YYRQ I LPVLN I FKNMNV 

i. ^ o v> iuj\y _k. lj x o xvr\ I'll* x oL^UXyD a LiEif\C CjK. I uuun/VT Xii 1 A.I V Vr 

TYESCLLN 


6331 


3 


495 


QO^QRVRTRGRRACASATPLEGCVDLSYPRTHAALLKVAQMVTL 
L IAF I CVRS S LWTNYS AYS YFE WT I CDLIMI LAFYLVHLFRFY 
R VLTC I SW P LS ELLHYL I GTLLLL I AS I VAAS KSYNQSG LVAGA 
I FGFMATFLCMAS I WLS YK I SCVTQS TDAAV 


6332 


1 


678 


. VTE SNKFDLVS F I PLLRER I YSNNQ YARQ F 1 1 S W I LVLE S VPDI 
NLLDYLPE I LDGLFQ I LGDNGKEI RKMCEWLGE FLKE I KKNPS 
SVKFAEP4ANILVIHCQTTDDLIQLTAMCWMREFIQLAGRVMLPY 
S SG I LTAVLPCLAYDDRKKS I KEVANVCNQSLMKLVTPEDDELD 
ELRPGOROAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 
V I G VALGPHLS NQDY FM YVTHT I VAATQRSGS SGS PP FCRQDTG 
KLS TMATH S Q L VKTGTGLE P RQAVS S SH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG 
Q PLT FS PSG RQ P LRSLL VGMC SGSGRRRS S LS PTMR PGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 








P ALRKVYDQM PEPRYWSMGS CANGGG YYHYS YS WRGCDR I VP 
VD I Y I PG C P PT AEALIj YG I LQLQR K I KRERRLQ I W YRR 


6342 


2 


1191 


D PRVRAMLATLARVAALRKTCLFS GRGGGRGLWTGRPQS DMNN I 
KPLEGVKILDLTRVLAGPFATMNLGDLGAEVIKVERPGAGDDTR 
TWGPPFVGTESTYYLSVNRNKKS IAVNIKDPKGVXI I KELAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS 
QRAGYDAVASAVSGLMHITGPEVACLSHIAANYLIGQKEAKRWG 
TAHGSIVPYQAFKTKDGYIWGAGNNQQPATVCKILDLPELIDN 
S KYKTNHLR VHNRKE L I K I LS ER FE E ELTS KWL YL FEG S G VP YG 
PINNMKNVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
MSEARPP PLLGQHTTHI LKEVLRYDDRAIGELLSAGWDQHETH 


6343 


2 


936 


GTAMVSDEDELNLLVI WDANP I WWGKQALKE S Q FTLS KC I DAV 
M VLGNSHL FMNRS NKLAV I ASH I Q E S RFL YPG KNGRLG D F FGD P 
GNPPEFNPSGSKDGKYELLTSANE VI VEE I KDLMTKSDI KGQHT 
ETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKV 
PQMP S LLQ YLLWVFLPDQDQRSQL I LPPPVHVD YRAACFCHRNL 
IBIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLK 
VSA 


6344 


2508 


147 


TM PTATLGNLRG YGMAS PGLAAP S LTP PQLAT PNLQQF F P Q ATR 
QSLLGPPPVGVPMNPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
P APE PE PCE A3 E L P AKRLRS S EE P TE KEP PGQ LQ VKAQPQ ARMT 
V P KQTQTPD LL P EALEAQ VL P RFQ P R VLQVQ AQ VQ S QTQ PR IPS 
TDTQVQ PKLQ KQAQTQTS PEHLVLQQ KQVQPQ LQQEAE PQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQ LQKQ VQTQT YPQ VHTQ AQ P S VQ PQEHP PAQ VS VQP P EQTHEQ 
PHTQPQVSLLAPEQTPVWHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACGLDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEBEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGS ETYS PNTAYGVDFLVP VMGYI CRI CHKFYHSNSGAQL 
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKr 


6345 


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEMEEMIEQLQEKV 
HELEKQNDTLKNRLISAKO^LQTQGYRQTPYNNVQSRINTGRRK 
ANENAGLiQECPRKG I KFQDADVAETPHPMFTKYGNSLLEEARGE 
I RNLENVI QSQRGQ I EELEHLAE I LKTQLRRKEKE IELSLLQLR 
EQQATDQRSN I RDNVEM I KLHKQL VE KSN ALS AMEGKF I QLQE K 
QRTLKI S HDALMANGDELNMQLKEQRLKCCS LEKQLHSMKFSER 
RI EELQDR INDLEKERELLKENYDKLYDSAFS AAHEEQWKLKEQ 
QLKVQ I AQLETALKSDLTDKTEI LDRLKTERDQNEKLVQENREL 
QLQYLEQKQQLDBLKKRIKLYNQENDINADELSEALLLIKAQKE 
QKNGDLS FLVKVDS E INKDLERSMRELQATHAETVQELEKTRNM 
LIMQHKINKDYQMEVEAVTRKMENLQQDYELKVEQYVHLLDIRA 
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 
ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRG LH P E YNFTSQ YL VHVNDL FLQ Y IQ KNT I TLEVHQ A YS TE Y E 
7IAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RVPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPKTAQL 
S STDSTDGNLNELH ITI RCCNHLQSRASHLQPHPYWYKFFDFA 
DHDTAI I PSSNDPQFDDHMYFPVPMNMDLDRYLKSESLS FYVFD 
DSDTQEN I Y I GKVNVPL I SLAHDR C I SGI FE LTDHQKHP AGT IH 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine,
K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VILKWKFAYLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 
L VLAP RPKP RQRLTP VD KKVS FVD I MPHQS D VS QEGS VD E VKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSEDETEITEDLEPEVEED 
MSASDSDDCIIPGPISKNIKQPSEKIRIEIIALSIiNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGOWVYYNYSNVIYVDK 
ENNKAKRD ILKAI LQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAH VDLADM FQEQRDL I EQNIDVFDARADGEG I GKLR VTVE AL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACIiLLFLEEEDAFWMMSAI I EDLLPAS YFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLIjOEHDIELST.TTr.HWflTvTn paomr 

DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSD I PSQMEI1ABLLLGVAMRLAGS LTDVAVETQRRKHLAYL 
I ADQGQLLGAGTLTNLSQ WRRRTQRRKS TITALL FGEDDLEAL 
KAKNI KQTELVADLREAI LRVARHFQCTDPKNCS WSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RG WF P AK FVE VLDERS KEYS I AGDDS VTEGVTDL VRGTL CPAL K 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
I CVGLNEQVLHLWLE VLCSSL PTVEKW YQPWSFLRS PGWVQ I KC 
E LR VL CCFAFS LSQDWEL P AKREAQQ PLKEG VRDML VKHHL FS W 
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASW
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI
FNTLSDIPSG^EDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCSWSROtiPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVE VLDERS KEYS IAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWS FLRSPGWVQIKC 

ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPRBMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKY KYE ECKDL I KFMLRN 
ERQFKEE KLAEQLKQAEELRQYKVL VHS QERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 








ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW
EDAVHI I PENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
Y S TLS I P P EMLAS YKS Y S S TFKS LE EQQVCMAVD IGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPBVLQDSLDRCYSTPSGCLELTDS 
CQ P YRS AP YVLE QQR VG LAVNMDE I E KYQE VE E DQD P S C PRLSR 
ELLDEKE PEVLQDSLGRCYSTPSGYLELPDLGQP YSS AVYS LEB 
QYLGLALDVDR I KKDQEEEEDQGPPCPRLSRELLEVVEPEVIiQD 
SLDRCYSTPSS CLEQPDS CQPYGSS F YALEEKH VGFSLDVGE IE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDE I E KYQE VEE DQD P S C PRLS GELLDE KB P E VLQE S LDR CYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRLSRELIjDEKEPBVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGIiALDVDRIKKDQEEEBDQGPPCPRLSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VG FS LDVGE I E KKGKGKKRRGRRS KKERRRGRKEGEEDQNP PCP 
RLNSMIiMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEH ISFALYVDNRFFTLTVTSLHLVFQMGVI FPQ 


6349 


3 


3679


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQ FRN LKE KCFLTQLACFLANQQNKYKYE ECKDLI KFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLS I P P EMLAS YKS YS S TFHS LE EQQVCMAVD I GRHRWDQ VK 
KEDHEATG PRLS REL LDE KGP E VLQDS LDRCYSTP S G CLELTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDR I KKDQEE E EDQGP PC PRLSRELLE WE PE VLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFS LDVGE IE 
KKGKGKKRRGRRS KKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDS LDRCYSTPSGCLELTDSCQP YRSAFY I LEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDS CQP YRSAFY I LEQQRVGLAVDMDE I EKYQE VEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 
LE WEPEVLQDSLDRC YST PS S CLEQPDS CQP YGS S F YALEEKH 
VGFS LD VG E I EKKG KG KKRRGRRS KKERRRGR KEGE EDQNP PCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FE E EH I S FAL YVDNRF FTLT VT S LHLVFQMGVI F PQ 


6350 


3 


3679 


AGAE KC FVTLLAC FLAKQQNKY KYEECKDLIKS MLRNE LQ FKEE 
KIAEQLKQAEELRQYKVLVHSQERELTQIiREKLREGRDASRSLN 
EHLQALLTPDE PDKSQGQDLQEQLAEGCRLAQHLVQKLS PENDN 
DDDEDVQ VEVAE KVQKSS S PREMQKAEEKEVPEDSLE E CAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDL I KFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 








ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLS I PPEMLAS YKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLD EKE P E VLQD S LGR CYS TP S G YLE LPD LGQ P YS S AVY S LEE 
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEVVEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLEIiTDSCQPYRSAFYILEQQRVGLiAVDMDEIEKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
P Y S S AV Y S LEEQ YIjG IiAIjDVDR I KKDQE E E E DQGP P C PRLSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VG FS LDVGE IEKKGKGKKRRGRRS KKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDXCYSTPSMYFELPDSFQHYRSVFY 
S FEE EH I S FAL YVDNR F FTLTVTS LHLVFQMGVI FPQ 


6351 


1291 


319 


R E ARRRTE RS QLGRMLVVE VANGRS L VWG AE AVQALR ERLGVGG 
RT VGAJjPRG PRQNSRLG LPLLLM P E EARLIiAE I GAVTLVS APR P 
DSRHHSLALTSFKRQQEESFQEQSALAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
G P S S S QAG PSNG VAPL P RS ALLVQ LATARPR P VKARP LDWR VQ S 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
f IjKr rlAni ±AUV-WA± , HJJ 1 1 PJLjyJJLVAAijKIiGTSVRKTLLLCSPQ 
PDGKWYTSLQWASLQ 


6352


235 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARSLVHDTVFYCXSVYQVKISPTPQIiGAASSAEGHVGQGAPG 

PAQAAMEGPQPENMQPRTRRTKFTLLQVEELESVFRHTQYPDVP 
TRRE LAENLG VTED KVR VW F KNKRARCRRHQRE LMLANELRADP 
DDCVYIWC 


6353 


65 


672 


RFAGAGAIPEARARPPDVQAAEEEKEMDLPDSASRVFCGRILSM
VNTDDVNAIILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER
FLHKTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSHIPEAS 
FLEEEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQPGSPAINGRSQTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFC 
HPVTDQSLIVELLELQAKVRGEAAARYHFEDVGGVQGARAVHVE
SVQPLSLENLALRGRCQEAWVLSGKQQIAKENQQVAKDVTLHQA


6355 


158 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSLD 
WDGKVSE I KKKI KS I LPGRS CDLLQDTSHLPP EHSDWI VGGGV 
LGLS VAYWL fCPCLE S RRG AI R VL VVERDHT YS QAS TG LS VGG I CQ 
QFSLPENIQLSLFSASFLRNINEYLAWDAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMS PDOLRNKFPWINTEG V 
ALAS YGM EDEG WFDP WCL LQGLRRKVQ S LG VLFCQGEVTRFVS S 
SQRMLTTDDKAWLKRIHEVHVKMDRSLEYQPVECAIVINAAGA 
WSAQIAAIAGVGEGPPGTLC^TICLPVEPRKRYVYVWHCPQGPGL 
ETPLVADTSGAYFRREGLGSNYLGGRS PTEQEE PDPANLEVDHD 
FFQDKVW PHLALRVPAF ETL KVQ S AWAG YYD YNT FDQNGWG PH 
PLWNMYFATGFSGHGLQCAPGIGRAVAEMVLKGRFQTIDLSPF 
LFTRFYLGEKIQENNII 


6356 


354 


633


TGLTSS CL P LQVMMTKRTKDMG KFSS VTVS TIDEEEEEI EARE V 
ADSYAQNAXVIEKQLERKGMSKRRLQELAELSAKKAKMKGTLID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVLRKQTSISQWVPVCSRLIPVSPTQGQGDRALS
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QPVEEKVGAFTKI IKAMGFTGPLKYSKWKIKIAALRMYTSCVBK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKQEGRSGKYM 
CRI I VH FM WED VQQRGR VMGVNP Y I LKKNM I LMTNHFYAAI LG Y 
DEG I LS DDHGLAAALWRTFFNRKCEDPRHLELLVE YVRKQ I QYL 
DSMNGEDLLLTGE VS WRPLVE KNPQS I LKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
AI S KTAVAP I ER VKLLLQ VQHAS KQ I AADKQ YKG XVDCI VRIPK 
EQGVLSFWRGNLANVIRYFPTQALNFAFKDKYKQIFLGGVDKHT 
QFWRYFAGNLASGGAAGATSLCFVYPLDFARTRLAADVGKSGTE 
RE FRGLGDCLVK I TKS DG I RGLYQGFSVS VQG 1 1 IYRAAYFGVY 
DTAKGMLPDPKNTHIWSWMIAQTVTAVAGWSYPFDTVRRRMM 
MQS GR KG AD I M YTGT VD CWRK I FRDEGG KAFFKGAW SNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086 


VCRQEEEKMKEDCIiPSSHVPISDSKSIQKSELLGIiLKTYNCYHE 
GKS FQ LRHREE EGTL 1 1 EGLLN I AWGLRR P I RLQMQDDRE QVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQR I RRHRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSIWTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADLG VEVPHE VAQYI KFEMPVLDSFVEKLKEEEERE 1 1 KLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEWIiPPRSCRVFWIHSGTTMSKVSFKITLTSDP 
RLP YKVLS VP ES T P FTAVL KFAAEE FKVPAAT S A I 1 TNDG I G I N 
PAQTAGNVFLKHGSELRI I PRDRVGSC 


6361 


615 


158 


RPGLGQLQHCALAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ 
FKDTLNT PLPDS S PVAVPLGAPI AVASTLS VEHNDGVETG I WAC 
APGRWRRQ ITSQE FCHF 1QGRCTFTPDD3ETLHI QAGDALMLPA 
NSTG I VTOIQETVRKTYVLIL 


6362 


350 


1576 


TTMDGSHSAALKIjQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWS IALNLQKYCH I RLAGSKDPRAYFKTKTWWLGLFLMLLG 
ELGVFASYAFAPLSLIVPLSAVSVIASAIIGIIFIKEKWKPKDF 
LRRYVLSFVGCGLAWGTYLLVTFAPNSHEKMTGENVTRHLVSW 
PFLLYMLVE 1 1 LFCLLLYFYKEKNANNI WILLLVALLGSMTW 
TVKAVAGMLVLS I QGNLQLD YP I F YVMFVCMVATAVYQAAFLSQ 
ASQMYDS S LIAS VG YILSTTI AITAGAI FYLDFI GEDVLH I CMF 
ALGCLIAFLGVFLITRNRKKPIPFEPYISMDAMPGMQNMHDKGM 
TVQPELKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKB 


6363 


21 


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGS ILASLDTFKKMWVSKKEYEEDGARS IHRKTF 


6364


21 


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPVVIDNGSGVIKAGFAG
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGVVLD
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF 


6365 


234 


1989 


KHKSRASC^ARAQAFGPSREREVHSRFRSGIJUlLGESNSGCCTM 
ASMGTIAFDEYGRPFLIIKDQDRKSRLMGLEALKSHIMAAKAVA 
NTMRTS LG PNGLDKMMVD KDGD VTVTNDGAT I LSMMD VDHQ I AK 
LMVE LS KSQD DE I GDGTTG VVVLAGALLEEAEQ LLDRG I H P I R I 
ADGYEQAARVAIEHLDKISDSVLVDIKDTEPLIQTAKTTLGSKV 
VNS CHRQMAE I AVNAVLTVADMERRD VDFEL I KVEGKVGGRLED 
TKLIKGVIVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LDVTS VEDYKALQKYEKE KFEEM I QQ I KETGANLAI CQWG FDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQEIS FGTTKDKMLVIEQCKNSRAVTIFIRGGNKMI IEE 
AKRS LHDAL C V I RNL I RDNR WYGGGAAE IS GALA VSQEAD KCP 
TLEQYAMRAFADALEVIPMALSENSGMNPIQTMTEVRARQVKEM 
NPALG I DCLH KGTNDM KQQHVI ETL I GKKQQ I S LATQMVRM ILK 
IDDIRKPGESEE 


6366 


257 


1898


GNKEGAHSSTFWVLLS I FLGAVAMLCKEQGITVLGLNAVFDI LV 
IGKFNVLEIVQKVI^mDKSLENLGMLRNGGLLFRMTLLTSGGAG 
ML YVRWR IMG TGP PAFT E VDNPAS FADS MLVRAVNYNYY YS LNA 
WLLLCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGLIC 
QALCSEDGHKRRILTLGLGFLVIPFLPASNLFFRVGFWAERVL 
YLPSVGYCVLLTFGFGALS KHTKKKKLI AAWLG I LFINTLRCV 
I^SGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKYVHAMNNLGNILKEPJIELQEAEELLSLAVQIQ 
PDFAAAWMNLG I VQNSLKRFEAAEQS YRTAI KHRRKYPDC YYNL 
GRLYADLNRHVDALNAWRNATVLKPEHSLAWNNMIILLDNTGNL 
AQAEAVGREALEL I PNDHSLMFSLANVLGKSQKYKES EALFLKA 
I KANPNAAS YHGNLAVL YHR WGHLDLAKKH YE I S LQLD P TAS GT 
KENYGLLRRKLELMQFCKAV 


6367 


287 


1934 


SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWALLDPS 
QKNLYRDVMQETFKNXiTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCES KE SHHCG E S FNQ IADDMLNR KTL PG I TPC E SS VCGE VGT 
GHSSLNTHIRADTGHKSSEYQEYGENPYRNKECKKAFSYLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGKAFTRSTTLPVHER 
THTGVNADECKE CGNAFSFPS E I RRHKRSHTGEKP YECKQCGKV 
FISFSSIQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 

ekpyecrqcgkafrctsdlqrhekthtedkpygckqcgkgfrca 
s q lq i herthsge kphecke cgkvfky fss lri herthtge kph 
eckc<:gkafryfsslhiherthtgdkpyeckvcgkaftcsssir 

YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRPPPLTMAVFHDEVEIEDFQYDE 
DS2TYFYPCPCGDNFS ITKEDLENGEDVATCPSCSLI IKVT TDK 
DQ F VCGET VP AP S ANKELVKC 


6369 


1 


1745 


AG CCRDTRF PTPRG PG S LCHN FCRSAACTVTRT IHGS PREDTGT 
P RS REMM FQDS VAFE D VAVS FTQE E WALLD PSQKNLYRD VMQ E T 
FKNLTSVGKTWKVQNI EDEYKNPRRNLSLMRE KLCES KESHHCG 
E S FNQ I ADDMLNRKTLPGI TP CES SVCGEVGTGHSSLNTHI RAD 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY 
DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 
IHERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTGVNADECKE 
CGNAFS FPS E I RRHKRS HTGEKP YECKQCGKVF IS FS S IQ YHKM 
THTGE KP YE CKQCG KAFRCGSHLQKHGRTHTGE KP YE CRQ CGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SSLHIHERTHTGDKPYECKVCGKAFTCSSSIRYHERTHTGEKPY 
EC KHCX3 KAF I SN YI RYHERTHTGEKPYQCKQCGKAF IRAS SCRE 
HERTHTINR 


6370


1711 


329 


FVLS EQRLRTERTW PRS PGLGRGAAAAGARTAGAGLLRLLLGCG 
ALVGGLR P VTMTTP ANAQNAS KTW E LS L YELHRTPQ EAI MDGTE 
I AVS PRSLHSELMCP I CLDMLKNTMTTKECLHRFCSDCIVTALR 
SGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQD 
RVLIRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQT 
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS 
PPE PGGE I ELVFRPHPLLVEKGE YCQTR YVKTTGNATVDHLS KY 
LALR I ALERRQQQEAGE PGG PGGG AS DTGG P DG CGG EGGGAGGG 
DG P EEPALPSLEGVSEKQ YTI Y I APGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDHLGECKSFKEKFMKC
LHNNNFENALCRKESKEYLECRMERKLMLQEPLEKLGFGDLTSG
KSEAKK 


6372 


2141 


625 


R VSAI AS EG KAE ER YKKLEDLLEKS FS LVKM P S LQP WM C VMKH 
LP KVP EKKLKLVMAD KEL YRACAVE VRRQ IWQDNQALFGDEVSP 
LLKQY I LEKES ALFSTELSVLHNFFS PSPKTRRQGEWQRLTRM 
VG KNVKL YDM VLQFLRTL FLRTRNVH YCTLRAE LLMS LHDLD VG 
E I CTVDPCHKFTWCLDAC I RERFVDS KRARELQG FXiDGVKKGQE 
QVU3DLSM I LO)PFAI NTIiALSTVRHLQELVGQETLPRDS PDLL 
LLLRLIALGQGAWDMIDSQVFKEPKMEVELITRFLPMLMSFLVD 
DYT FNVDQKLPAEEKAPVS YPNTLPES FTKFLQEQRMACEVGLY 
YVLHI TKQRNKNALLRLLPGLVETFGDLAFGD I FLHLLTGNLAL 
LADEFALEDFCSSLFDGFFLTASPRKENVHRHALRLLIHLHPRV 
APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PS RAARAS PARLPAMVS W 1 1 S RLWL I FGTLYPAYYS YKAVKS K 
DI KEYVKWMMYW 1 1 FALFTTAETFTD I FLCWFP FYYELK1AFVA 
WLLS P YTKGSS LL YRKFVHPTLSS KE KE IDDCLVQAKDRS YDAL 
VHFGKRGLNVAATAAVMAASKGQGALSERLRSFSMQDLTTIRGD 
GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASSSGTA 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTS ST I FLS S TR DHSC PTHTS CNYTS S T I FLS S TRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAELQTEGSNGKKEVLSGFQWLEDTVLFPEGGGQPDDRGTIN 
D I S VLR VTRRG E QADHFTQTPL DPG S Q VL VRVD WERR FDHMQQH 
SGQHL I TAVADHLFKLKTTS WELGRFRSAI ELDTPSMTAEQVAA 
IEQSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGLPDDHAGPIR 
WN I EG VDSNM CCGTHVSNLS DLQVI KI LGTEKGKKNRTNL I FL 
SGNRVLKWMERSHGTEKALTALLKCGAEDHVEAVKKIjQNSTKI l 
QKNNLNLLRDLAVH IAHSLRNSPDWGGVVILHRKEGDSEFMNI I 
ANE I GS E ETLLFLTVGDEKGGGLFLLAG PPAS VBTLGPRVAE VL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKE 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQWRG PTMLVTAYLAFVGLLAS CLGLELS RCRAK 
PPGRACSNPSFLRFQIjDFYQVTFLALAADWLQAPYIjYKLYQHYY 
FLEGQIAI LYVCGLASTVLFGLVASSIiVDWLGRKNS CVLFSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVAS WIGLGPVAP 
FVAAI PLLALAGALALRNWGENYDRQRAFSRTCAGGLRCLLSDR 
RVLLLGTIQALFESVIFIFVFLWTPVLDPHGAPLGI I FSSFMAA 
SLLGSSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMLTFSTSP 
GOES PVES FI AFLL I ELACGL YFPS MS FLRRKVT PETEQAGVLN 
WFRVPLHSLACLGLLVTjHDSDRKTGTRNMFSICSAVMVMALLAV 
VOLir 1 V VKtiDAEJjKVPSPTEEPlAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QDLL I AALGMKLpGS P KS S VT I WQPLKL FAYS QLTS LVRRATL KE 
NEQIPKYEKIHNFKVHTFRGPHWCEYCANFMWGLIAQGVKCADC 
GLNVHKQCS KMVPNDCKPDLIuiVXKVYSCDLTTLVKAHTTKRPM 
WDMC I RE I ESRGLNS EGLYRVSGFSDL I EDVKMAFDRDGEKAD 
I S VNM YE D I N 1 1 TGALKL YFRDLPI PL I T YDAYPKFIESAKIMD 
PDEQLE T LHE ALKLL P P AHCETLRYLMAHL KR VTLHE KENLMNA 
ENLG I VFG PTLMRS PELDAMAALNDIRYQRLWELLI KNED I LF 


6377 


2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QRVEDVRLIREQHPTKI PVI I ERYKGBKQLPVLDKTKFLVPDHV 
NMSELIKIIRRRLQLNANQAFFLLVNGHSMVSVSTPISEVYESE 
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK
WAEYHADIYDKVSGDMQKQGCDCECLGGGRISHQSQDKKIHVYG
YSMAYGPAQHAISTEKIKAKYPDYEVTWANDGY


6379 


35 


378 


ERAGSPSPSRAALRRCAPQRSQAPRWPDRAACRRSFQGSQGRAY 
LFNS WNVGCGP AEERVLLTGLHAVAD I YCENCKTTLGWKYEHA 
FES S Q K Y KEG KYI I ELAHM I KDNGWD 


6380 


1414 


462 


PAVQGQRGAG P PTGRGSGNMAR FALT WRHG E TRFNKE KI I QGQ 
GVDE PLS ETGFKQAAAAG I FLNNVKFTHAFS SDLMRTKQTMHG I 
LERSKFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CP VFTP PGGETLDQVKMRG I DFFE FLCQL I LKEADQKEQFSQGS 
PSNCLETS LAE I FPLGKNHSSKVNSDSGI PGLAAS VLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFIINFEEG 
R E VKPTVQ C I CMNLQDHLNGLTENS LGLNLP S KS NH FE PL KG VP 
LALFTSLLC 


6381 


1668 


218 


AWRAQGSRG FSGAGWRPRQAAAMNFS EVFKLS SLLCKFS PDGK 
YLAS CVQ YRLWRDVNTLQ I LQLYTCLDQIQHI EWS ADSLF I LC 
AMYKRGLVQVWSLEQPEWHCKIDEGSAGLVASCWS PDGRHILNT 
TEFHLR I TVWSLCTKS VS Y I KYPKACLQGITFTRDGR YMALAER 
RDCKDYVS I FVCSDWQLLRHFDTDTQDLTGIEWAPNGCVLAVWD 
TCLEYKILLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQFLAVG 
SYDGKVRILNHVTWKMITEFGHPAAINDPKIVVYKEAEKSPQLG 
LGCLS F P PPRAG AG PL PS S ES KYE IAS VP VSLQTL KP VTDRANP 
KIG IGMLAFS PDS YFLATRNDNI PNAVWVWD IQKLRLFAVLEQL 
SPVRAFQWDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGEGDFA 
VLS LCWH LSGDS MALLS KDH FCLC FLE TEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCL IAYPLKGDHGI VDI VDNSDCEPKS KLLRWTTNK 
KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 
TVEKKGN I SSQLKHYNPWSMKCHQQQLQRMKENAKHRNQ YKFIL 

AVI GWVCGMQVYQ AGSGQIiM FMNKYHGR KLS VQG F KEALFQFF 
HNGRYLRRELLGP VLKKLTELKAVLERQES YRF YS S SLLVI YDG 
KERPEWLDSDAEDLEDLSEESADESAGAYAYKP I GASS VDVRM 
I DFAHTTCRLYGE DTWHEGQDAGYI FGLQS LID I VTE I SEESG 
E 


6383 


3159 


1061 


S PAPGR PS P HGSQ P AARAAAAP AMP S AKQRGS KGGHG AAS PS E K 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLASQP 
AARAAAAPAMPSAKQRGSKGGHGAAS PS E KGAHPSGGADDVAKK 
P P P APQQP P P P P A P H P QQHPQQHPQNQAHGKGGHRGGGGGGG KS 
S SS S S ASAAAAAAAASSS AS CSRRLGRALNFLFYLALVAAAAFS 
GWCVHKVLEEVQQVRRSHQDFSRQREELGQGLQGVEQKVQSLQA 
TFGTFESILRSSOHKQDLTEKAVKQGESEVSRISEVLQKLQNEI 
LKDLSDGIHWKDARERDFTSLENTVEERLTELTKSINDNIAIF 
TEVX3KRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSREWDMEALRSTLQTMESDI YTEVRELVSLKQEQQAFKEAAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEELRQLKSDSHGP 
KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 

l?C,T.I?CTiT.Q VQriT?WT70PT^a&T (VDT TTr*T J~* pcpji rviTV^ T w rumm r» t 

a o ij no u uo ivo y Ei ric>y icuri/VbyvsK ucj o ijOobE>ADyDGIiASXVRSL 
G ETQLVL YGD VE E LKRS VGE L P S TVE S LQ KVQEQVHTLLSQDQ A 
QAARLPPQDFLDRLSSIiDNLKASVSQVEADLKMLRTAVDSLVAY 
S VKI ETNENNLES AKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


738 


1904 


IWEVPVCLTHLLHLQQANQPLPPPSSSINEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ
HGTVSASPQTLQQSLPRSIAPKPLTMRLPMNQIVTSVTIAANMP
SNIGAPLISSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ
QQQMQQMQQQQLQQHQMHQQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 


1584 


PRVRAADVAAGAQAWS AGMAKSNGENG PRAPAAGES LSGTRES 
LAQGPDAATTDELS SLGSDSEANGFAERR I DKFGFI VGSQGAEG 
ALEEVPLEVLRQRESKWLDMLNNWDKWMAKKHKKIRLRCQKGIP 
PSLRGRAWQYLSGGKVKLQQNPGKFDELDMSPGDPKWLDVIERD 
UttKy r f cajunr VbKvjbn^UUULFRVLiKAYTLYRPEEGYCQAQAP 
IAAVLLMHMPAEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQfCVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDMFFCEGVKIIFRVGLVLLKHALGSPEKVKACQGQYETIER 
LRSLSPKIMQEAFLVQEWELPVTERQIEREHLIQLRRWQETRG 
r-ljy V..K& r'lr'KJjrtwU\/\lJjlJA£i tt» FKPALiQPSPSIRLPLDAPLPGS 
KAKP KP PKQAQKE QRKQM KGRGQLE KP P APNQAM WAAAGDACP 
PQHVPPKDSAPKDSAPQDLAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


T VCG S F YLG I MQRAS RL KRE LHMLATEP PPG I T C WQDKDQMDDL 
RAQ I LGGANTP YE KG VFKLE VI I PER YP FEP PQ I RFLT P I YH PN 

PLMADISSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMLDNL 
PEAGDS RVHNS TQ KRKASQ LVGI E KKFH P DV 


6387 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR 
KQELAETLANLERQIYAFEGSYLEDTQKYGNIIRGWDRYLTNQK 

REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388


1 


662 


KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY


6389 


1074 


497 


AE PGDRMAGHRL VL VLGDLH I PHRCNS LP AXFKKLL VPGK I QH I 
LCTGNLCTICESYDYLKTLAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVIPWGDMASLALLQRQFDVDILISGHTHKFEAF 
EHENK FY INPGS ATGAYNALE TN HPS FVLMDIQAS TVVTYVYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ 
SGEVTQLLNTMGHHTVGLKLHRKGDRFFPSLGQTWDP


6391 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIVISSLVTTQRKLKA 
MSLU3SRNQLAPAVLNPNPMDFCTKDLLTTTSERIIAYLRDFNE 
DQKKAI ETAYAMVKHS PS VAK I CL I HGP PGTGKS KT I VGLLYRL 
LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKIILEF 
KEKCKDKKNPLGNCGDINLVRLGPEKSINSEVLKFSLDSQVNHR 
MKKE L PS HVQAMHKR KEFLD YQLD ELS RQRALCRGGRE I QRQE L 
DEN I S KVS KERQE LAS KI KE VQGRPQ KTQS III LE SHU C CTLS 
TS GG LLLE S AFRGQGG VP F S CV I VDEAGQS CE I ETLTPLIHRCN 
KL I LVGDP KQLP PTV I S MKAQE YG YDQSMMARFCRLLEENVEHN 
MI SRL P I LQLTVQ YRMHPD ICL FP SN YYYNRNLKTNRQTEAI RC 
SSDWP FQPYLVFDVGDGSERRDNDS YINVQE IKLVME I I KLIKD 
KR KDVS FRN I G 1 1 THY KAQKTM I Q KDLDKE FDRKGP AE VDTVD A 
FQGRQ KDC V I VT CVRANS I QGS I G FLAS LQRLNVT I TRAKYS LF 
ILGHLRTLMENQHWNQLIQDAQKRGAIIKTCDKNYRHDAVKILK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDSKEITLTVTSKDPERPPVHDQLQDPRLLKRMGIEVKGG 
I FLWDPQPS S PQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GEQE KCGS ETHHTRRNSRWDKRTLEQEDSS S KKRKLL 


6392 


972 


186 


GRTGVDLAS SMAHRLQIRLLTWDVKDTLLRLRHPLGEAYATKAR 
AHGLEVE PS ALEQG FRQ AYRAQS HS FPN YG LSHGLTS RQ WWLD V 
VLQTFHLAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 

KP D PR I FQE ALR LAHM E P WAAHVGDNYLCD YQGPRAVGMHS FL 
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 


6393 


2017 


730 


TGG S KMAAVAT CGS VAAS TGSAVATASKSNVTS FQRRG P RAS VT 
NDSGPRLVS IAGTRPSVRNGQLLVSTGLPALDQLLGGGLAVGTV 
LL I E E DKYN I Y S PLL FKYFLAEG I VNGHTLL VAS AKED P ANI LQ 
ELPAPLJjDDKCKKEFDEDVYNHKTPESNIKMKIAVJRYQLLPKME 
I G P VS S SR FGHYYDAS KRM PQE L I EAS NWHG FFL P EK I S S TLKV 
EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGS PLWGDD I CCAENGGNSHSLTKFLYVLRGLLRTSLSAC I ITM 
PTHL I QNKAI I AR VTTLS DVWGLES FI GS ERETNPL Y KD YHGL 
IHIRQIPRLNNLICDESDVKDLAFKLKRKLFTIERLHLPPDLSD 


6394 


1418 


511 


GAAAGGEGARRRPAAMATVMAATAAERAVLEEEFRWLLHDEVHA 
VL KQLQ DI L KEASLRFTL PGS GTEG PAKQENF I LGS CGTDQ VKG 
VLTljQGDALSQADVNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 
NHVSQAIYLLTSRDQSYQFKTGAEVLKIM3AVMLQLTRAR1JRLT 
TPATLTLPE IAASGLTRMFAPALPSDLLVNVYINLNKLCLTVYQ 
LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 
VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYWSYRPF 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 

DLL E EAS S RDMQMG PGL FLRMQLVP S I EE RE TPLTREDRP ALQE 
PPWSLGCTGLKAAMQIQRWIPVPTLGHRNPWVARDSGE ! 


6396 


1 


1221 


AN I LS S P S KRGQ KGTLIG YS PEGTPLYNFMGDAFQHS S QS I PRF 
IKESLKQILEESDSRQIFYFLCLNLLFTFVELFYGVLTNSLGLI 
SDGFHMLFDCSALVMGLFAALMSRWKATRIFSYGYGRIEILSGF 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GS AGGGMNANMRGVFLHVLADTLGS IG VI VSTVL I EQFGWFI AD 
PLCSLFIAILIFLSWPLIKDACQVLLLRLPPEYEKELHIALEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TGI LKDAGVNNLTI QVEKEAYFQHMSGLSTGFHDVLAMTKQMES 
MKYCKDGT Y I M 


6397 


391 


122 


GAGGVGRFEAIRAPARM I E WCNDRLG KKVRVKCNTDDT I GDLK 
KLIAAQTGTRWNKI VLKKWYTI FKDHVS LGD YE I HDGMNLEL YY 
Q 


6398


353 


1306 


HKQMGPLINRCKKILLPTTVPPATMRIWLLGGLLPFLLLLSGLQ 
RPTEGSEVAIKIDFDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
D I EAQKNYFRMWQKAHLAWI^O^KVLPQNMTTTHAVAI LFYTLN 
SNVHS DFTRAMASVARTPQQYERS FHFKYLHYYLTSAI QLLRKD 
S I MENGTLCYEVHYRTKDVHFNAYTGATIRFGQFLSTS LLKEE7A 
Q E FGNQTL FT X FTCLGAP VQ Y FS LKKEVL I P P YEL FKV I NMS YH 
PRGDWLQLRSTGNLSTYNCQLLKASSKKCIPDPIAIASLSFLTS 
VIIFSKSRV 


6399 


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTWSKRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKKMAECEAEN 
EDLLKKLELYKEACEGQHKLECDLQQREEE IAE LQKALS DMQVC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKES S EHYQRD IQTLI LQVEALQAQLGEQTKIiSREQIEGL I ED 
RRIHIjEEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMS KI KQYRVQCKKKEDKI GKVLPVMHE 
SHHAQSEYIKVMSLCRNEVVYFSGRVEGIPKNLQFVM 


6400 


2520 


1053 


KT*M KC DE VVYEVQS AI LRHNCG YAM KTG KFFHNLMERKD FETWL 
DNISVTFLSLTDLQKNETLDHLISLSGAVQLRHLSNNLETLLKR 
DFLKLLPLELSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
S L I GHS ARVYAL Y Y KDGLL CTGS DDLS AKLW D VS TGQ C VYG I QT 
HTCAAVKFDEQKLVTGS FDNTVACWEWS SGARTQH FRGHTGAVF 
S VDYNDELDI LVS GS AD FTVKVWALS AGTCLNT L TGHTE WVTKV 
VTiQKCKVKSLLHS PGD YILLSADKYE IKIWP IGRE INCKCLKTL 
S VSEDRS I CLQPRLHFDGKYI VCSS ALGLYQWDFAS YDILRVI K 
TPEIANLALLGFGDI FALLFDNRYLYI MDLRTESLI SRWPLPEY 
RKSKRGSS FLAGEAS WLNGLDGHNDTGLVFATSMPDHS IHLVLW 
KEHG 


6401 


109 


766 


PGAAWSRPDLRGCCTGPQPALRMLVLPSPCPQPLAFSSVETMEG 
PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTQGVPYTVL 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHSITHS 
EVKPFECDICGKAFKRASHLARHHSIHLAGGGRPHGCPLCPRRF
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


1196 


279 


TTSQCGGIRQSSAIPVASMEFAAICLRNALIiLLPEEQQDPKQEN 
GAKNSNQLGGNTES S ES SETCS S KSHDGDKF I PAPPS SPLRKQE 
LENLKCS ILACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALI S LDR I S DAITHLNP ENVTDVS LGIS S NEQDQG S 
DKGENEAMES SGKRAPQCYPSS VNS ARTVMLFNLGSAYCLRS E Y 
DKARKCLHQAASMIHPKEVPPEAILLAVYIiELQNGNTQLALQ 1 1 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 


2 


1690 


RG I HTS VLQGNLQNQMYSHNWI MNLNNLNLTQVQQRNLI TNLQ 
RSVDDTSQAIQRIKNDFQNLQQVFLQAKKDTDWLKEKVQSLQTL
AANNSALAKANNDTLEDMNSQLNSFTGQMENITTISQANEQNLK
DLQDLHKDAENRTAIKFNQLEERFQLFETDIVNIISNISYTAHH
LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLR
MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGIiPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWKNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WIGLTDSERENEW KWLDGTS P DYKNWKAGQPDNWGHGHG PGE DC 
AGL I YAGQWND FQ C ED VNNF I CE KDRE T VLS SAL 


6404 


1012 


222 


AAALAMAAP AP GLISVFSSSQE LGAALAQLVAQRAACCLAGARA 
R FALGLSGGS LVSMLARELPAAVAPAGPAS LARWTLG FCDERLV 
P FDHAESTYGLYRTHLLSRL P IPESQVIT INPELP VEEAAEDYA 
KXLRQAFQGDSIPVFDLLILGVGPDGHTCSLFPDHPLLQEREKI 
VAP ISDSPKPPP QR VTLTL P VLNAART V I F VATGBGKAAVLKR I , 
LE DQEBNP L P AALVQ PHTGKLCW FLD EAAARLLTVP FE KHS PL 


6405 


1 


1456 


AAL PR PTPRAP LGREGTGSD S E MAASM FYGRLVAVATLRNHRPR 
TAQRAAAQ VLG S SGLFNNHGLQ VQQQQQRNL S LHE YMS MELLQ E 
AGVS VPKG YVAKS PDEAYAIAKKLGSKDW I KAQVLAGGRGKGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VL VCE R K Y P RRE Y Y FAI TME RS FQGP VL I G S SHGG VN I E D VAAE 
TPFAIIKEPIDIEEGIKKEQALQLAQKMGFPPNIVESAAENMVK 
L YS L FLKY D ATM I E I NPMVE D DG A VL CMD AX T N FD <5 W<? A YR OTT 

K I FDLQDWTQEDERDKDAAKANLNYIGLDGN IGCLVNGAGLAMA 
TMD 1 1 KLHGGT PANFLDVGGGATVHQVTEAFKLITS DKKVLAI L 
VN I FGG I M R CD VIAOG T VMAV KD LE T K I P VWR tjOGTR VDDA V A 
L I AD SGLK I LACDDLD E AARMWKLS E I VTLAKQAHVD VKFQL P 

I 


6406


1036 


167 


HPRQMRGEDTPEAPPYSSGRYDS I KTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDS EGMDPERUCAFNM FVRLFVDENLDRM 
VP I S KQP KE K I QA I I E S CS RQ F PE FQERARKR I RT YLKS CRRMK 
KNGM EMTR PTP PHLTS AMAEN I LAAACE S E TRKAAKRMRL E I YQ 
S S QDEPI ALDKQHSRDSAAI THSTYS LPAS S YSQDP VYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARALASRPAPSWV 
CRAALGSGMG RG KQR P VM ERG CLT A 


6407 


492 


150 


VG LCLAVS QTVLAQLDALLVF PGQ VAQLS CTLS PQHVT I RDYG V 
S WYQQRAGSAPRYLLY YRS EEDHHRPAD I PDRFS AAKDEAHNAC 
VLTI S PVQ PEDDADY YCS VGYGFS P 


6408 


1458 


903 


RGCITSSQAWRLFGGVTRGFNMRIEKCYFCSGPIYPGHGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN
SFEFEKRRNEPIKYQRELWNKTIDAMKRVEEIKQKRQAKFIMNR
LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE
DVDMEDAP 


6409 


150 


446 


NTALANLLRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP
GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS
HGRAYLFNSV 


6410 


85 


607 


RGG TAGCVAC LG CWGQ S S S P KAAF P AGS ACL PADS C PC LLFQAC 
AI SGLFNC I T IHPLNIAAGVWMIMNAFILLLCEAPFCCQFIEFA 
NTVAEKVDRLRS WQKAVF YCGMAWP I VI SLTLTTLLGNAIAFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK
KRASHKPrYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ
WENVSAMIEEVFEATDIKITVYTL


6412 


61 


1709 


RPVTSFSPLPGSCGGRLGTRTMLGRSLREVSAALKOOQITPTEL 
CQKCLSLIKKTKFLNAYITVSEEVALKQAEESEKRYKNGQSLGD 
LDG I P I AVKDNFSTSG I BTTCASNMLKGYIPP YNATWQ KLLDQ 
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN 
PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDAAIV 
IiGALAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLV 
PELSSEVQSLWSKAADLFESEGAKVIEVSLPHTSYSIVCYHVLC 
TS EVASNMARFDGLQ YGHRCD I DVSTEAMYAATRREG FNDWRG 
R I LSGNFFIiLKENYENYFVKAQKVRRL IANDFVNAFNSGVDVLL 
TPTTLSEAVPYLEFI KEDNRTRSAQDDI FTQAVNMAGLPAVS I P 
VALSNQGLPIGLQFI GRAFCDOOLLTVAKWFEKQVQFPVI QLQE 
LMDDCSAVLENEKLASVSLKQ 


6413 


2 


885 


HEPRCAGMAASLWMGDLEPYMDENFI SRAFATMGETVMSVKI IR 
NRLTGIPAGYCFVEFADLATAEKCLHKINGKPLPGATPAKRFKL 
N Y ATYG KQ P DNS P E YSLFVGD LTPD VDDGML YE F FVKVYP S CRG 
GKWLDQTGVSKGYGFVKFTDELEQKRALTECQGAVGLGSKPVR 
IiSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSIGV 
r is voiAj v rttAc.l^AJL7C.FiiiN i i il»Ki' V r yQRrRPSVvKDClHAV 
LKEELANAEYSPEEMPQLTKHLSENIKDKiKEMGFDRYKLMWQV 
VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415 


2 


1168 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
F I E AAVAGLGAGSGKRRRGWKMP VHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTESSSVSEDGDSSEMDDEDCERRRMECLDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQ IRTKVAGI YRELCLESVKNKYECE I QAS RQHCESEKLLLYD 
1 vybiiijishKlRRIjEEDRHSIDITSELWNDE 
KKKPGWSGPYIVYMLQDLDILEDWTTIRKAMATLGPHRVKTEP 
PVKLEKHLHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 


410 


1519 


E IAPADLE I PACAP VLLS RATSSTMS VTGGKMAPSLTQE I LSHL 
GLASKTAAWGTLGTLRTFLNFSVDKDAQRLLRAITGOGVDRSAI 
VDVLTNRS REQRQL I SRNFQERTQQDLMKS LQAALSGNLERI VM 
JMjIAJ f 1 Ay r iJAyc.J-)K I AJuKA£»IJbAV.U VA1K 1 LiAl RTPPQLiQECIj 

AVYKHNFQ VEAVDG I TS ETSG I LQDLLLALAKGGRDS YSG I ID Y 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELE^AVQNRFHGDAQVALLGLASVIKNTPLYFADKLHQALQ 
ETEPNYQVLIRILISRCETDIjLSIRAEFRKKFGKSLYSSLQDAV 


6417 


1 


845 


rgesrvlwselegeaggaggwasslnarmdnrfatafviacvls 
l i st i ymaas igtdfwye yrs p vqenssdlnks i wdefi sdead 
ektyndalfrywgwglwrrcitipknmhwysppertbsfdvvt 
kcvsftlteqfmekfvdpgnhnsgidllrtylwrcqfllpfvsl 
glncfgal 1 glcaci crs lypt i atg i lhllaglctlgs vs cyv 
agiellhqklelpdiwsgefx3wsfclacvsaplqfmasalfiwa 
ahtnrkeytlmkayrva 


6418 


2 


662 


TRTPPPRDPflT./7Zi&VriV2inR.PCTCTPft.riacD ft ft ft VORnouDDAU 

TPAP P P PP P CGG IACHG E PAKF YG YDNLQRQ P I FTTQQE AE LVQ 
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLELE KE FLFNP YLTRKRR I EVSHALALTERQVKI WFQ 
NRRMKWKKENNKI)KFPVSRQEVKDGETKKEAQELEEDRAEGLTN 


6419 


1 


973 


PGRP R VRNFDLNS KS ILQEFFCTRS I Q I PANRS KTAMSKCP I FP 
MARS I STSG PLDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 

PSPDP vtvp yls plwwkelesllbnegdhai tvadfvdhhp r V 

FWNLVWYFRRLDLPSNLPGLILSSEHCNKYSKIPRHCMSEDSKY 
VLIQMLWDNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNSF 

nqellksmvksikmndvygpmsqiletlnkcphfkrqrslyrei 
lflslvalgren i d i dafdke ykmaydrltps qvks thncdrp p 
stgvmecrktfgepyl 


6420 


207 


1187 


RKMIDKNQTCGVGQDSVPYMICLIHILEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKI PEDG PAL 1 1 F YH 
GAIPIDFYYFMAKIFIHKGRTCRWADHFVFKIPGFSLLLDVFC 
ALHGPREKCVE I LRSGHLLAI S PGGVREALI SDET YNI VWGHRR 
GFAQVAIDAKVPI I PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 
FAPMYGGFPVKLRTYLGDPIPYDPQITAEELAEKTKNAVQALID


6421 


1844 


362 


WALSLRRQPERMSNKLLSPHPHSWLRSEFKMASSPAVLRASRL 
YQWSLKSSAQFLGS PQLRQVGQI IRVPARMAATLI LEPAGRCCW 
DEPVRIAVRGLAPEQPVTLRASLRDEKGALFQAHARYRADTLGE 
LDLERAPALGGS FAGLE PMG LLWALEP E KP LVRLVKRD VRT P LA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLP KTMETLHLE YFEEAMNYLLSHPE VKGPGVGLLGI SKGG 
ELCLSMAS FLKG ITAAWINGS VANVGGTLRYKGETLP PVGVNR 
NRI KVTKDG YADIVDVLNSPLEGPDQKS FI PVERAESTFLFLVG 
QDDHNWKSEFYANEACKRLQAHGRRKPQIICYPETGHYIEPPYF 
PLCRASLHALVGSPIIWGGEPRAHAMAQVDAWKQLQTFFHKHLG 
GREGTIPSKV 


6422 


181 


2133 


EGENLSWFQEFWGDIAKEFYWKtPCPGPFLRYNFDVTKGKIFIE 
WMKGATTNI CYNVLDRNVHEKKLGDKVAFYWEGNEPGETTQITY 
HQLLVQ VCQ FSNVLRKQG I HKGDRVAI YMPMI PEL WAMLACAR 
IGALHSIVFAGFSSESLCERILDSSCSLLITTDAFYRGEKLVNL 
KELADEALQKCQEKGFPVRCCIWKHLGRAELGMGDSTSQSPPI 
KRS CPD VQ I S WNQG I DLWWHELMQEAGDE CE PEWCDAEDPLF I L 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
IGWI TCjHS Y VT YGPLANGATS VLFEG I PTYPDVNRLWS I VDKY K 
VTKFYTAPTAIRLLMKFGDEPVTKHSRASLQVLGTVGEPINPEA 
WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETT YFKKF PG Y YVTGDGCQRDQDGYYW I TGRI DDMLNVSGHL 
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6423 


614 


1237 


ANLKE I PRDL P PE TVLL YLDSNQ I TS IPNEI FKDLHQLRVLNLS 
iuvvji ar xutsnAr A.v»viui iJjy lljDijoUWRiQSVHKJjAFNNLKARA 
RIANNPWHCDCTLQQVLRSMASNHETAHNVI CKTSVLDEHAGRP 
FLNAANDADLCNL PKKTTDYAMLVTMFGWFTMVI S YWYYVRQN 
Q EDARRHLE YLKS LPS RQ KKADE PDD I S TW 


6424 


1 


1188


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGD VWWAHNRMTLKQLKDR YPRKRF VAGVGANSK I SA 
JjR*jols.vjlsJJWnlfVf v\j1o VIDENCaKI IGELNKENDRILVAqGGL 
GGKLLTNFLPLKGQKR I IHLDLKLIADVGLVGFPNAGKSSLLSC 
VSHAKPAIADYAFTTLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGHKFLKHIERTRQLLFWDISGFQLSSHTQYRTAFETII 
LLTKELELY KEELQTKPALLAVNKMDLPDAQDKFHELMS QLQNP 

KT1FT>HT]FRKMN T P'RRTVTF PO"RT TDTQ AVTY2TmTiri?T btvtptdvpt 
rujr unu; Liiu^iiiraivi viir v n J. JLir icnv 1 \9ciO± J!i1!iJjIvN\« J,KJ\oL 

DEQANQENDALHKKQLLNLWISDTMSSTEPPS KHAVTTSKMDI I 


6425 


1850 


1144 


LAMEGGGGI PLE TL KEESQSRHVLPAS FEVNS LQKSNWGFLLTG 
LVGGTLVAVYAVAT PFVTPALRKVCLPFVPATMKQI ENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKKGFTAVGYELNPWLVWYSRYRA 
WREGVHGS AKF Y I SDLWKVTFSQYSNWI FGVPQMMLQLEKKLE 
RELEDDARVIACRFPFPHWTPDHVTGEG IDTVWAYDASTFRGRE 
KRPCTSMHFQLP IQA 


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFSFSPEPTLEDIRR 
LHAE FAAERDWEQFHQPRNLLLALVGEVGELAELFQWKTDGE PG 
PQGWSPRERAALQEEL5DVLIYLVALAARCRVDLPLAVLSKMDI 
NRRR YPAHLARSS S RKYTELPHGAI S EDQAVGPADI PCDSTGQT 
ST 


6427 


145 


959 


AASWGPPHVTKAGKMVSWMI CRLWLVFGMLCPAYAS YKAVKTK 
N1REYVRWMMYWIVFALFMAAEIVTDIFISWFPFYYEIKMAFVL 






WO 01/53312 



PCT/US00/34263 



WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LSFGKRGLNIAASAAVQAATKSQGALAGRLRSFSMQDLRSISDA
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VPRAPARPREKPLIRSQSLRWKRKPPVREGTSRSLKVRTRKKT 
VPS DVDS 


6428 


1982 


444 


SGSGGKMEDHQHVP I D I QTS KLLDWLVDRRHCS LKWQSLVLT I R 
EKINAAIQDMPESEEIAQLLSGSYIHYTHCLRILDLLKGTEAST 
KNIFGRYSSQRMKDWQEIIALYEKDNTYLVELSSLLVRNVNYEI
PSLKKQIAKCQQLCX2EYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELLALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEL 
PEQVAEDAIDWGDFGVEAVSEGTDSGISAEAAGIDWGIFPESDS 
KDPGGDGIDWGDDAVALQITVLEAGTQAPEGVARGPDALTLLEY 
TETRNQFLDELMELEIFLAQRAVELSEEADVLSVSQFQLAPAIL 
GGQT KE KMVTMVS VLEDL I G KLTS LQ LQHLFMl LAS PR YVDR VT 
EFLQQKLKQSQIiLALKKELMVQKQQEALEEQAALEPKLDLLLEK 
TKELQKLIEADISKRYSGRPVNLMGTSL


6429 


3413 


3442 


E P S S W TAAP RG P LAAH P LE AAVQED DR RAL S FD SR I K VFANGTL 
WKSVTDKDAGDYLCVARNKVGDD YWLKVDWMKPAKI EHKEE 
NDHKVF YGGDL KVD CVATG LPNPEISWS LPDGS LVNS FMQS DDS 
3GRT KR YW FNNGT L YFNEVGMREEGD YTCFAENQ VGKD EMR VR 
VKWTAP AT I RNKT CLAVQVP YG D WT VACEAKG E PMP KVTWLS 
PTNKVIPTSSEKYQIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE ' 
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
IPTPRVLWAFPEGVVLPAPYYGNRITVHGNGSLDIRSLRKSDSV 
QLVCMARNEGGEARLI VQLTVLEPME KP I FHDP I S EKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMIjH 
I SGLSS VDAGAYRCVARNAAGHTERLVSLKVGLKPEANKQYHNL 
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHIiEGPQTLGRVSL 
LDNGTLTVREASVFDRGTYVCRMETEYGPSVTS I PVI VIAYPPR 
ITSEPTPVIYTRPGNTVKLNCMAMGIPKADITWELPDKSHLKAG 
VQARb i GNRFLH P QG S LT I QHATQKDAG FYKCMAKN I LiG S US KT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGAS STRGDTG I PGS GEGGAGPGGGEG 
AMLEAMAE PS PEDP P PTLKPETQPPEKRRRTI EDFNKFCS FVLA 
YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWDSAPSDLRTI 

rvrriPl TV V5k VO OVT3D * TV /"\ J\ ^T»WI5^ nT»T3 CTPCOT /^J4 DTSP J\ ipT T WWI 

QTr VKKAKSblCKKAAQAot 1 T. UrQjfPRSTr bK±AJAPDbAlljLiEKM 
KLKDSLFDLDGPKVASPLSPTSLTHTSRPPAALTPVPLSQGDLS 
HPPRKKDRKNRK1GPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEEEEEEEEEBEMA 
T WGGEAPVPVLPTPPE APRP PATVHPEGVPPADS ES KEVGSTE 
TSQDGDAS S SEGEMRVMDED IMVESGDDSWDLI TC YCRKP FAGR 
PMIECSLCGTWIHLSCAKIKKTl^VPDFFYCQKCKEIiRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNS S YN L PAY A P YLP C EAC AMQDGR KGG A YAG KMEATT AG VG R 
LEEEALRRKSRLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPEPV 
IEEVDIJ^IAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 
LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLG TMGS R I KQN P ETT FE VYVE VA Y P RTGGTLSD P E VQ RQ F P E 
D YSDQE VLQTLTKFCFP FYVDSLTVSQ VGQNFTF VLTD I DSKQR 
FGFCRLSSGAKSCFCI LS YLPWFEVFYKLLNI LADYTTKRQENQ 
WNELLETLHKLPIPDPGVSVHLSVHSYFTVPDrRELPS I PENRN 
LTE YFVAVD VNNMLHL YASML YE RR I L 1 1 CS KLST LTAC I HG S A 
AMLYPMYWQHVYIPVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN 
MALDDVVILNVDTNTLETPFDDLQSLPNDVISSLKNRLKKVSTT 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TG DG VARAFLKAQAAF FG S YRNALKI E P E B P I T FCEEAFVS HYR 
SGAMRQFLQNATQLQLFKQFIDGRLDLLNSGEGFSDVFEEEINM 
GEYAG5DKLYHQWLSTVRKGSGAILNTVKTKANPAMKTVYKFDI 
AENGCAPTPEEQLPKTAPS PLVEAKDPKLREDRRPITVHFGQVR 
P PRPHWKRP KSNIAVEGRRTS VPSPEQNTIATPATLH ILQKS I 
TKFAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA ' 
PTTP PQPGWCLCGKDFKS S CQTPGRE KERRLATMHGS CS FLMLL 
LPLLLLLVATTGPVGALTDEEKRLMVELHNLYRAQVSPTASDML 
HMRWDEELAAFAKAYARQCVWGHNKERGRRGENLFAITDEGMDV 
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQWWAKTERIGCG 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCE P IGSPEDAQDLP YLVTEAPSFRATEASDSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLLLLPPLVLAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE - 

S PELRQKS PL FQ FAE I S S S TS HSD AS TKQCQTS AL FQ FAE ISSN 

TSQLGGAEPVKRCGKSALFQLAEMCLASEGMKMEES KL I KAKES 

DGGRIKELEKGKEEKEIKMEKTDETRLQKEAEFEKSAKENLRDS 

KELRNFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 

SASSKII ISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 

KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIEAVAKGDWG 

IEKLGDTPRKKVRTSSSGKGS ILDAKPPKKKVKSREKKMSKEKS 

SDTTKESRPPDFIS I SASKNI SGETPEGI KAEPLTPMEDALPPS 

LSGQAKPEDSDCHRKIETCGSRKSERSCKGALYKTLVSEGMLTS 

LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 

TKPKEDCLLGSAKLDEEFEKKFNSLPQYSPVTFDRKCVPVPRKK 

KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 

ME PVHKVKN I PS I FNTPEPTTTARTFGGQ PKEKS KENPDYS PCQ 

DTQRAGYHHEEVLWMTNI/^NNCGGVYLKQLRHTAMTNA 


6435 


2227 


657 


ALQRDAAAAYAHPE YEERFLQEETVS QQINS I ELLQTRPLALPE 
WKSQRPLQRQVHLRGRPAS QPTVIRG ITYYKAKVS E EEND I EE 
QQDEFFSGDNGVDLL I EDQLLRHNGLMTS VTRRPAATRQGHS TA 
VTS DLNARTAPWS S AL PQ P S TS DPS IANHAS VGPTLQTTSVS PD 
PTRES VLQ PS PQVP ATTVAHTATQQPAAP AP PAVS PREALMEAM 
HTVP VPPTTVRTDS LGKDAPAGRGTTPAS PTLSPEEEDDIRNVI 
GRCKDTLS T I TGPTTQNTYGRNEG AWMKD PLAKDER I YVTNYYY 
GNTL VE FRNLENFKQGRWSNS YKL P Y S W I GTGHWYNGAF YYNR 
AFTRNI I KYDLKQR Y VAAWAMLHDVA YEEATP WRWQGHS DVDFA 
VDENGLWLIYPALDDEGFSQBVIVLSKLNAADLSTQKETTWRTG 
LRRNFYGNCFVICGVLYAVDS YNQRNANI S YAFDTHTNTQ I VPR 
LLFENEYFYTTQ I DYNPKDRLLYAWDNGHQVTYHVI FAY 


6436 


1295 


341 


GACRPPVRQDPDSGPDYEALPAGATVTTHMVAGAVAGILEHCV^J 
Y P I D CVKTRMQS LQ P D PAAR YRNVLE ALWR 1 1 RTEGLWRPMRGL 
NVTATGAGPAHALYFACYEKLKKTLS DVI HPGGNSHIANGAAGC 
VATLLHDAAMN P AE WKQRMQM YNS P YHRVTD C VRAVWQNEGAG 
AFYRSYTTQLTMNVPFOAIHFMTYEFLQEKFNPQRRYNPSSHVL 
S GACAGAVAAAATTP LD VCKT LLNTQ ES LALNSH I TGH I TGMAS 
AFRTVYQ VGGVTAY FRG VQARVI YQ I PSTAI AWS VYEFFKYLI T 
KRQEEWRAGK 


6437 


1828 


360 


PPAPAPPAS PARHVTRTARGHXiEGGSRAPPLLQAVFLQI KNMVK 
L I HTLADHGDD VNCCAFS FS LLATCS LDKT I RL YS LRDFTEL PH 
S PLKFHTYAVHCCCFS PSGHILASCSTIX3TTVLWNTENGQMLAV 
MEQPSGSPVRVCQFSPDSTCLASGAADGTVVLWNAQSYKLYRCG 
SVKDGSLAACAFSP^SFFVTGSSCGDLTVWDDKMRCZLHSEKAH 
DLGITCCDFSSQPVSDGEQGLQFPRLASCGQDCQVKIWIVSFTH 
ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TNTENILHTLTQHTRYVTTCAFAPNTLLLATGSMDKTVNIWQFD 
LETLCQARSTEHOLKOFTEDWSPFnVQTWT.r3VOnT.TfnT VrTtrirwi 
NN I DG KELLNLTKES LADDLKI ES LGLRS KVLRKI EELRTKVKS 
LSSGIPDEFICPI TRELMKDPVIASDGYSYEKEAMENWDPAKRN 
RTSPP 


6438


109 


901 


EV0ILRAKMF0TGGLIVFYGLLiAnTMAnFnf;7,PVPr.nnTT.Dr nt\;' ~ 
NPALPLSPTGLAGSLTNALSNGLLSGGLLGILENLPLLDILKPG 
GGTS G GLLGGLLG KVTS VI PGLNN 1 1 D I KVTDPQLLELGL VQS P 
DGHRL YVT I PLG I KLQ VNTP LVG AS LLRLAVKLD I TAE I LAVRD 
KQER IHLVLGDCTHSPGS LQ I SLLDGLGPLP I QGLLDSLTG I LN 
KVLPELVQGNVCPLVNEVLRGLDITLVHDIVNMLIHGLQFVIKV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 
KQAKE E AQME VE Q YRRE REHE FQS KQQAAMGS QGNLS AE VE QAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


niia ui t rj ul> fvjij v kjjo 1 H±iK 1 u FN uij P Vr i K. V iX^QR FGQNRT 
IKLLTGS S YKVEVKI KPS TLQVENI S IGGVLVPLELKSKEPDGD 
RWYTGTYD TEG VTPTKS GERQP I Q I TM P FTD I GTFETVWQ VKF 
YNYH KRDHCQWG S P FS V I E YE CK PNE TRS LM WVNKES FL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 
RNVC KENS T VGM K I QEE LQ RSGGL DHLVLS PGE W P VS DNT I MH I 

JiTIiT? A T .'I" ITlVTdfT T~inT VT3 ff M\/Dr , virc T^n?VT ncDtinriTomTrn'in 
ttirta/UJi i ux W^ljUUljii\.nmvt\\* X vtil VKlUjlriiKKr'OPATIEGC 

AQLKPNNYLLAWHTPFNEKGSGFGAATKAMCIGLRYWKPERLET 

LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 

MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 

DS ENKA I F PDNYDAE ERE KT YRKWS S EGRGGRRG HD APM I A YDA 

GLYQDLEDKEKLEDLGAALYRLSTEEK 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGALAGLKTVSSYS 
LQRQSLLDMSLVKLQLCHMLVEPNLCRSVLIANTVRQIQEEMTQ 

GHTQG P VSDLCPVTS AQAPRHLQSSAWEMDGPRENRGS FHKSLD 
QIFETLETKNPS CMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEG LAPAT PGP S S S CKS DLGELDHWE I L VET 


6443 


2 


555 


MASPAASSVRPPRPKKEPQTLVIPKNAAEEQKLKLERLMKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 
E Y0RQDYMDAMARKOKLDAE FOXRT .PTfTJK T a aFFfiTl VPP v von 

KLKEKKLIjAKKMKLEQKKQEGPGQPKEQGSSSSAEASGTEEEEE 
VPSFTMGR 


6444 


390 


899 


GS TPRGKMRAP IPEP KPGDL I E I FRPF YRHWAI YVGDGYWHLA 
P PS E VAGAGAAS VMS ALTDKAI VKKELL YD VAGSDKYQVNN KHD 
DKYSPLPCSKIIQRAEELVGQEVLYKLTSENCEHFVNELRYGVA 
RSDQVRDVI IAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARS PRPQAHTKGVRGLPS rrrs pdcgrmelaags f 
s e eqf weacaelqq p alagadwqllvetsg i s i yrlldkktg l y 
eykvrcvledcsptliadiymdsdyrkqvtoqyvkelyeqecnge 
twywevkyp fpmsnrdyvylrqrrdldmegrkihvi larstsm 
pqlgersgvirvkqykqslaiesdgkkgskvfmyyfdnpggqip 
swlinwaakngvpnflkdmaracqnylkkt 


6446 


1 


1651 


rcptrspppdtpgsrgttamcslasgatggrgaveneedlpels 

dsgdeaawededdadlphgkqqtpclfcnrlftsaebtfshcks 

ehqfnids^khglefygyiklinfirlknptveymnsiynpv 

pwekeeylkpvleddlllqfdvedlyepvsvpfsypkglsents 

weklkhmearalsaeaalararedlqkmkqfaqdfv^tdv^ 

cssstsviadlqededgvyfssyghygiheemlkdkirtesyrd 

f i yqntph i fkd xwl d vg c3tg i ls m f aakag a kkvlg vd qs e t 
SEQ
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LYQAMDIIRLNKLEDTITLIKGKlEEVHLPVEKVDVIISEWMGY 
r JjJ-ir noFLuUb VIjiAJ^K.iIjAJ\.(JGSV x PDICTISLVAVSDVNKHA 
DRIAFWDDVYGFKMSCMKKAVIPEAVVEVLDPKTLISEPCGIKH 
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK 
KDPRS LTVTLTLNNS TQTYGLQ 


6447 


1554 


1068 


RLGPAEWHLSGPCHATLGAANRGRALGVRAAWRGAPLCQRVMMP 
SRTNLATG I PSSKVKYSRLSSTDDGYIDLQFKKTPPKI PYKAIA 
LATVLFLIGAFLI I IGSLLLSGYISKGGADRAVPVLI IGILVFL 
PGFYHLRIAYYASKGYRGYSYDDIPDFDD 


6448


74 


559 


GQVLS HC YH YRS S RWRRGGLS RGRGAG VMALVP YE ETTE FGLQK 
FHKP LAT FS FANHT I Q I RQDWRHLG VAAWWDAAI VLS T YLEMG 
AVELRGRSAVELGAGTGLVGIVAALLACRIRYERDNNFLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 


6449


597 


1876 


EYGVCENLRKLEITGVSCRDVYAKLLHRYRHILGLWQPDIGPYG 
GLLNWVDGLFIIGWMYLPPHDPHVDDPMRFKPLFRIHbMERKA 
AT VE CM YGHKG PHHGH I Q I VKKDE FS TKCNQTDHHRMS GGRQEE 
FRTWLREEWGRTLED I FHEHMQEL I LMKF I YTS Q YDNCLT YRR I 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITG 
DPN I PAGQQTVE IDLRHR I QLPDLENQRNFNELS RI VLEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAEQPAQCGQGQPFVLPVGVSSRNEDYPRTCRM 
Lr lij iUljlA(jHGr^SPERrPGVFILFDEI)RFGFVwLELKSFSLY 
SRVQATFRNADAPSPQAFDEMLKNIQSLTS 


6450 


848 


269 


FVPAPRTVSGKRSLPGEWEERGEGEQRTGREFSGNGGRAVEAAR 
MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGQHC 
DTWS YVL I PAAQ PGQ FT VDHRVWTHEQ AGR PQDQPAGQEL VAAS 
RDAGPVHLPGQSSGPLG 


6451 


232 


939 


HS PT P PTS PRAS TMEDVKLE FPS LPQCKEDAE E WTYPMRREMQE 
I LPGLFLGP YS S AMKS KLP VLQKHG I THI IC I RQNI EANF I KPN 
FQQLFR YLVLD I ADNPVENI I RFFPMTKEFIDGSLQMGGKVLVH 
GN AG I S RS AAF V I AYI MET FGMKYRDAFAYVQERR F C INPNAGF 
VHQLQE YEAI YLAKLT I QMMS PLQ I ERS LS VHSGTTGSLKRTHE 
EEDD FGTMQVATAQNG 


6452 


1 


652 


RTRG E S S NM E P LAA YPLKC S G PRAKV FAVLLS I VL CTVTL FLLQ 
LKFLKPKINS FYAFEVKDAKGRTVS LEK YKGKVS L WNVASDCQ 

T ■ i ■ t itj XT"V"T *n T ITT?T TIV IT CV*^T5 OXJC C\ TT IvTJDr'V'li^cvnc'cr'nnnc'vmron 

i_i j. i LAjLtivciljiil\-&f oxir toVLiAr JrUNyr uisbrilrKrSKBVES 
FARKNYGVTFPIFHKIKILGSEGEPAFRFLVDSSKKEPRWNFWK 
YLVNPEGQ WKFWRPEEP I E VI RPD IAALVRQ V 1 1 KKKEDL 


6453 


827 


223 


HRRKLPGLSMSPRRTLPRPLSbCLSLCLCLCLiAAAIiGSAQSGSC 
RDKKNCKVVFSQQELRKRLTPIjQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 

fsygmhrvetscsqcgahlghifddgprptgkrycinsaalsft 
padssgtaeggsgvas paqadkael 


6454 


827 


223 


hrrwlpglsmsprrtlprplslclslclclclaaalgsaqsgsc 
rdkknckvvfsqqelirkrltplqyhvtqekgtesafegeythhk 
dpg i ykc wcgtplfks etkfdsgsg w ps fhdv inseai tftdd 

FS YGMHRVETSCSQCGAHLGH I FDDGPRPTGKRYCINS AALS FT 
PADSSGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


RVHLATVSAS AAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL " 
LMTHGVLEEWD\^KRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLYIEI KRG VTEDDGRP I YALVNLATTS I S KMATDFAENELDLF 
RKALELIIDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVK1CNICHSL 









SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYWPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


R PQS RS I S MWRNS LLQ VS SGLR WLRVCAM VD I LGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI PKLVI VKQNGEVITNKGRKQIRERGLACFQDWVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPiMlLYVSKLNKIIHFPDFDKKIPV 
)\Ltt lrijFljJjXVGNHISGbSSTSKijSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQYSLNI I LSVFAI I LGAFIAAGSDLAFNLEGYI FVFLND 
I FT AANGVYT KQ KMD P KE LGKYG VL F YNAC FM 1 1 PTL 1 1 S VS TG 
DLQQATE FNQW KNWF I LQ FLLS CFLGFLLM YS TVLCS Y YNSAL 
TTAWGAI KNVS VAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
rU-ir VLtPULiY VGNHIbCsliSoTSKLSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQYSLNI I LSVFAI I LGAFIAAGSDLAFNLEGYI FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNWFI LQFLLS CFLGFLLM YS TVLCS YYNSAL 
TTAWGAI KNVS VAY IGILIGGDYIFS LLN FVGLNI CMAGG LR Y 
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
rujr yxjk'liU x V(jJNHls>CjijbS I SKLSLPMb 1 vLRKFTIPLTLi-iLET 
1 1 LGKQYSLNI I LSVFAI I LGAFIAAGSDLAFNLEGYI FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI IPTLI ISVSTG 
DLQX^ATEf^QWKhn/VFILQ FLLS CFIiGFIJjMYSTVLC^Y YNSAL 
TTAWGAI KNVS VAY I G I L IGGDY I FS LLNF VGLN I CMAGG LRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KL FP L PLLYVGNH I SGLS STS KLS LPMFTVLRKFT I PLTLLLET 
I ILGKQYSLNI ILSVFAI I LGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANG VY T KQ KMD PKELG KYGVLFYNACFM 1 1 PTLI ISVSTG 
UJjWOAI h t NQW KJfVVFI LQFLLSCFLGFLLMYS TVLCS Y YNSAL 
TTAWGAI KNVS VAYIG I LIGGDYI FSLLNFVGLNI CMAGG LRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


LQQRTLR I TAVGQTHP I AWMAWEPSLGAFYGPAS FI TFVNCMYF 
LS I FIQLKRHPERKYELKEPTEEQQRLAANENGE INHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
jjuvr or v rvjftiDusroftf r v vriilL.vriKrjL)vr<J-AW irll CLPQKSS 

YS VQVNVQPPNSNGTNGEAPKCPNS SAESSCTNKS AS S FKNS S Q 
GCKLTNLQAAAAQ CHANS LPLNSTPQLDNS LTEH S MDND I KMHV 
APLEVO FRTNVHS SRHHKNRS KGHR A 9 R T /TUT .R V ya vm/DTQ vi? 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
STLPKS S RNFEKP VSTTS KKDALRKPAVVELENQQKS YGLNLAI 
QNGP I KSNGQEGPLLGTDSTGNVRTGLWKHETTV 


6462 


3 


773 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTS I KEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYP YPQM YD P S H PAYRAVS P VLMHSYPGAYLSPGFHYPVYG 
KMSGREETEKVNTSPSVNTKTTTESKALDLLQQHANQYRSKSPA 
P VEKATAEREREAERERDRHS P FGQRHLHTHHHTHVGMGY PL I P 
G Q YD P FQG LTSAALVASQQ VAAQ AS AS GM FPGQRRE 


6463 


2 1 


350 


VI LC I LGG W I FKNADRSMBKKKGEPRTRAEARP WVDEDLKDSS D 
LHQAEEDADEWQESEENVEH I PFSHNHYPEKEMVKRSQEFYELL 
NKRRS VRF ISNEQVPME VI DNV I RTAGL 


6464 


12 


1154


GILRQKEREERNRIHKKEILFLEHLLWPSEMSSLSGKV(iTVt6
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LVEPSKLGRTLTHEHLAMTFDCCyCPPPPCQEAISKEPIVMKNL 
yWIQKNAYSHKENLQLNQETEAIKEELLYFKANGGGALVENTTT 
G I SRDTQTLKRLABETGVHI I SGAGFYVDATHSSETRAMS VEQL 
TDVLMNE I LHGADGTS IKCGI IGE IGCSWPLTESERKVLQATAH 

AQAQLGCPVIIHPGRSSRAPPQIIRILQEAGADISKTVMSHLDR 

TTT.DtCTTFTJ .F.FZVnTi^PVT.FVnT.OT3TT7T,T.lJvnT fiDDTriMnnnxTini 
* ui/ivwjuuDmyiAJV< I UDluur v3 X "''"in 1 \J JLA7.tr U X Url ir JJUH l\xC 

I RRVRLLVEEGCEDRI LVAHD IHTKTRLMKYGGHGYSH I LTNW 
PKMLLRG I TENVLDKI LIENPKQWLTFK 


6465 


126 


1396 


run vrr ivj nt\ririri fv.iv 1 i /iojjv. IjIj i n^yjrrjwjjltjiJSJttvJlJ^JUJjKKyUlC 
QEAQVFGNQLIPPNAQVKKATVFXjNPAACKGKARTLFEKNAAPI 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVI IVAGGDGTLQEV 
VTGVLRRTDEATFS KIP IGFI PLGETSSLSHTLFAESGNKVQHI 
TDATLAI VKGETVPLD VLQ IKGEKEQ PVFAMTGLRWGS FRDAGV 
KVSKYWYLEPLKIKAAHFFSTLKEWPQTHQASISYTGPTERPPN 
EPEETPVQRPSLYRRILRRLASYWAQPQIIALSQEVSPEVWKDVQ 
LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 
KVRNP FCLHVEGTECLQASQCTLLI PEGAGGSFS IDSEEYEAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
P FCNHE KS CD VKMDRARNTG V I S CTVCLEE FQT P I TYLS E P VD V 
YSDW I DACEAANQ 


6467 


301 


2571 


GELR VLALAHGELACHAVIiTAS LLS LRSRLMDS DMD YE R PNVET 
I KCVWGDNAVG KTRL I CARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFS I ANPNSLHHVKTMWYPE I KHFCPRAP V I LVGCQLDLRY 
ADLEAVNRARRPLARPIKPNEILPPEKGREVAKELGIPYYETSV 

VAUrljlruJVr DJXAlKA/UjloKKHLtUr WKSHLRNVQRPLLQAPFL 

PPKPPPPIIWPDPPSSSEECPAHLLEDPLCADVILVLQERVRI 
FAHKIYLSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHffflHGRDFLLRAASFDVCESVDEAGGSGPAGLRAST 
SDGI LRGNGTG YLPGRGRVLSSWSRAFVS IQE EMAEDPLTYKSR 
LMVWKMDSS I QPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMKVANI LNNEAFMNQEI TKAFHVRRTNR VKECLAKGT 
FSDVTFILDDGTISAHKPLLISSCDWMAAMFGGPFVESSTREW 
FPYTS KS CMRAVLE YLYTGMFTSSPDLDDMKL I ILANRLCL PHL 
VALTEQ YTVTGLMEATQMMVD I EX3DVLVFLELAQFHCAYQLADW 
CLHH I CTNYNNVCRKF PRDMKAMS P ENQE YFE KHRW P P VW YLKE 
EDHYQRARKEREKEDYLHIjKRQPKRRWLFWNS PSS PS SSAAS ss 
SPSSSSAW 


6468 


3 


1374 


DAWAGTNMAALAP VGS PAS RG PRLAAGLRLLP MLGLLQLLAE PG 
USRVHHIJ^KDDVRHKVHIOTFGFFKDGYMVVNVSSLSLNEPED 
KD VT I G F S LDRTKNDGFSS YLDEDVN YC I LKKQS VS VTLLILD I 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFOFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRKTDVFKIHWLMAAL 
PFTKSLSLVFKAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTNMAALAPVGS PASRGPRIiAAGLRLLPMLGLLQLLAEPG 
LGRVHHlALKDDVRHKVHLNTFGFFKDGYMVVNVSSLSIiNEPED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GEIPLPKLYIS^FFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTT E YGLWKDS L FL VDLLCCGAI LF P WWS I RH LQE AS ATD 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASGVSSRADAPVLAQS PAS AGNGRPSTPRVPGSRRHPS APRS 
GPLPREDGCRTPGPQLLPLPGALLRPRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTESPSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENESPAETDLQAQLQMFRAQWMFELAPGVSSSNLENRPCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPD IEFKI TYTRSPDGDG VGNS YI EDNDDDS KMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
S DLDLRS L EQLS L VCRG F YI CARD PE I WRLACL KVWG RS C I KL V 
P YTS WREM FLERPRVRFDG VY I S KTT YI RQGEQS LDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


299 


FFFDKMAAGGSGVGGKRS S KSDADSG FLGLRPTS VD PALRRRRR 
G P RNKKRG WRRLACE P LG L E VDQ FLED VRLQERTS GG LLS EAPN 
EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK 
VPAPKX)VLAHQVPNAKKLRRKEQLWEi<LAKQGELPREVRRAQAR 
LLNP S ATRAKPG PQDTVER P FYD LWAS DNP LDR P LVGQDE F FLE 
QTKKXGVKRPARLHTKPSQAPAVEVAPAGASYNPSFEDHQTLLS 
AAHEVEIiQRQKEAEKLERQLALPATEQAATQESTFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 
VKRLRVQQAALRAARLRHQELFRLRG I KAQVALRLAELARRQRR 
RQARREAEADKPRRLGRLKYQAPD I DVQLSS ELTDS LRTLKPEG 
NI LRDR FKS FQRRNM I EPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDLQQQ IMT 1 1 DELG KAS AKAQNLS AP I TS AS RMQSNRH W Y 
I LKDS S ARPAGKGAI IG F I KVGYKKLFVLDDREAHNEVE P LC I L 
DFYI HES VQRHGHGRELFQYMIiQKERVE PHQLAIDRPSQKLLKF 
LNKH YNLETTVPQVNNFVI FEGFFAHQHRPPAPS LRATRHSRAA 
AVDPT PAAPARKLP PKRAEGD I KP YS SS DREPLKVAVEP PWPLN 
RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


SSAVE FVWEGE KMAAEPNKTE I QTLF KRLRAVPTNKACFDCGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ 
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ 
LGS AALARHGTD LW I DNMS S AVPNHS P E KKDS D F FTE HTQP PAW 
DAPATEPSGTQQPAPSTESSGLAQPEHGPNTDLLGTSPKASLEL 
KS S 1 1 G KKKPAAAKKGLG AKKGLGAQKVS SQSFSEIE RQAQ VAE 
KLREOQAADAKKOAEESMVASMRLAYQELQIDR 


6474 


3 


462


LQRQRQHPAAAPAVP VRCFTFC FTDI VI M PKRKS PENTEGKDGS 
KVTKQE PTRRS ARLSAKPAP P KP EPKPR KTS AKKEPGAKI SRGA 
KGKKEEKQEAGKEGTAPSENGETKAEE IH I SRSTVNVSTSRGTP 
PSTJjo VKGQlfci rvRVKljl EN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQE PTRRS ARLSAKPAP P KPEPKPRKTS AKKEPGAKI S RGA 
KGKKEEKQSAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PS TLS VKGO IET VR VKGTEN 


6476


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMSVLKQRIAEETIL 
KSQVDKRFSAHYDAVEAELKS S TVGLVTLNDMKARQEALVRERE 
RQLAKRQHLEEQRLQQERQR BQEQRRERKRKI S CLS FALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQRE KVKDEEMEVT F S YVJIX5S GHRRTVR VRKGNTVQQ FLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
SGPLFS FDVHDDVRLLSDATMEKDESHAGKVVLRSWYEKNKHI F 
PASRWEAYDPEKKWDKYTIR 
SEQ
ID 

NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6477 


227 


915 


LQGHLMG I MAAS RP LS RFWEWGKN I VCVGRNYADH VREMRSAVL 
SEPVLFLKPSTAYAPEGSPILMPAYTRNLHHELELGWMGKRCR 
AVPE AAAMDYVGGYALCLiDMTARDVQDECKKKGLPWTLAKS FTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
I ISYVSKI I7LEEGDI I LTGTP KG VGPVKENDE I EAG IHGLVSM 

I r A. V Hil\f C 1 


6478 


2 


1495 


FVSSRILPES LAS S EAS TLEAMGRKEEDDCS S W KKQTTN I RKT F 
I FMEVLGSGAFSE VFLVKQRLTGKLFALKCI KKS PAFRDS S LEN 
E I AVLKK I KHENIVTLEDI YESTTH YYL VMQL VS GGE L FDR I LE 
RGVYTEKDAS LVI QQVLSAVKYLHENGI VHRDLKPENLLYLTPE 
ENS X IMITDFGLS KMEQNG IMSTACGTPGYVAPEVLAQKP YS KA 
VDCWS IGVI TYI LLCGYPPFYEETES KLFEKI KEGYYEFES PFW 
DD I S ESAKDF I CHLLE KDPNERYTCEKALS HPW I DGNTALHRDI 

VDCVOT riTOVMT?3i QVUlsnTiCKIftfc VT.UMUT.WQPfSYTPD 

i ro v oJj(J J-ijlviN r i\i\i3i\Yi K\iAc Wflftnv VririnKi\jjnruNijrioi'Vj vi\±* 
EVENRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSLAAGPCGCC 

O O V_ J_)iM i. uDJVVjivO O X V^^jj Ci tr X -LJi jrvxvr\Ai x\.xv^lv 17 Ran v ft v tr v zuwvjjo 

HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLFPVFDLS 
YF I VS I L YL KYE PGAVE LSRRHP IAS WLCAMLHC FGS Y I LAD LL 
LGE PL I DYFSNNSS I LLASAVWYL I FFCPLDLF YKCVCFLP VKL 
I FVAM KE WRVRK I AVG I HHAHHHYHHG WFVM I ATGWVKGS GVA 
LMSNFE QLLRG VW KP ETNE I LHMS FPTKAS LYGA I L FTLQQTRW 
LPVSKASLIFIFTLFMVSCKVFLTATHSHSSPFDALEGYICPVL 
Fn<? Art^nHHKnNHGR<^H<?GGGPGAQHS AMPAKS KEELSEGS RK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWY 
WRRTLGMQVRYVHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKD 
MWLS WKFL PKNLHLVCVDMPGHEGTTRS S LDDLS IDGQVKR I H 
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP 
AGLQ YSTDNQFVQRLKELOGSAAVEKI PLI PSTPEEMSEMLQLC 
S YVRF KVPQQ I LQG L VD VR I PHNNFYRKLFLE I VS EKS R YS LHQ 
NMDKI KVPTQ 1 1 WG KQDQVLDVSGADMLAKS IANCQVE LLENCG 
HS WMERPRKTAKLI IDFLASVHNTDNNKKLD 


6482 


2517 


568 


EPVSKVSQSRRKAGVPTANIEESQAVEAAMANVPWAEVCEKFQA 
ALALS R VE LH KNP E KE P YKS KYS ARALLEEVKALLG PAPEDEDE 
p p p af tv; pn jvnnH at a t . paf wf P FC3 p V AORAVRLAV I e fhlgv 
NHI DTEELS AGEEHLVKCLRLLRRYRLSHDCI SLC IQAQNNLG I 
LWS EREE I ETAQAYLES S EALYNQYMKE VGS P PLDPTERFLPEE 
EKLTEQERS KRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ 
NAQLSMQDN I GELDLDKQSELRALRKKELDEEES IRKKAVQFGT 
GEL CDA I SAVE E KVS YLRPLDFEE ARELFLLGQHYVFEAKE FFQ 
I DG YVT DH I E WQDHS ALFKGLAFFETDMERR C KMHKRR IAMLE 
PLTVDliNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLAKFR VARL YGKI I TAD P KKE LENLATS LE H Y KF I VD Y CE KHP 
EAAQE I E VELELS KEMVS LLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAPL5ANGREARAMEQRLAEFRAARKRAGLAAQP 
P AAS QGAQT PGE KAE AAATLKAAPGWLKRFLVW KPR P AS ARAQ P 
GLVQEAAQ PQGSTS ETPWNTAI PLPSCWDQS FLTNITFLKVLLW 
L VLLG LFVE LE FGLA Y F VLS LF YWMYVGTRG PEE KKEG E KS AYS 
VFNPGCEAIQGTLTAEQLEREIjQLRPIiAGR i


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGESIGNCPFCQRLF 
MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK 
NTQKEANKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCSLLPKLNI IKVAAKKYRDFDI PAEFSGVW 
RYLHNAYAREEPTHTCPEDKE IENTYANVAKQKS 


6485 


6 


1031 


FVDLVRAVEFLPCPDSQKLEKECQSSEBSMGSNSMRSILEEDEE 
DEE P PRVLLYHEPRS FE VGMLVWHKHKKYPFWPAWKS VRQRDK 
KAS VLYI EGHMNPKMKG FTVSLKS LKHFDCKEKQTLLNQARED F 
NQDIGWCVSLITDYRVRLGOGSFAGSFLEYYAADISYPVRKSIQ 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVEYIGKAKGAESHLRAILKSRKPSRWLQTFLSSSQYVT 
CVETYLEDEGQLDLWKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLLPEAI I CAI S AGDE VDYKTAEEKYI KG PSLS YRE KE I FDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLS PSRVTQG I YYMLAFS EMPKP PDYSELSDS LTLA 
GGTGRFSGPLHRAWRMMNFRQRMGW I GVGL YLLASAAAFYYVFE 
I S ETYNRLALEH I QQJH P E E P LEGTTWTHS LKAQLLS LP F W VWTV 
I FL VP YLQM FLFL YS CTRAD P KTVG Y C 1 1 P I CLAV I CNRHQAFV 
KASNQISRLQLIDT 


6487


352 


863 


SFIiKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGC I FHRNI KGFMVQTGDPTGTGRGGNS I WGKKFEDE YS E YL 
KHNVRGWSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID 
GLETLDELEKLPVNEKTYRPLNDVHI KDITIHANPFAQ 


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPLPGARGPSWPPSPRVP 
MEPPNLYPVKLYVYDLSKGIiARRLSPIMLGKQLEGIWHTSIVVH 
KDEFFFGSGGISSCPPGGTLLGPPDSVVDVGSTEVTEEIFLEYL 
SSLGESLFRGEAYNLFEHNCNTFSNEVAQFLTGRKIPSYITDLP 
SE VLSTPFGQALRPLLDS IQIQP PGGS SVGRPNGQS 


6489 


1457 


375 


KVAKMATALS EEELDNEDY YS LLNVRREASSEEL KAAYRRLCML 
YH PD KHRDPE LKS QAERL FNL VHQAYEVLSDPQTRAI YD I YG KR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT 
I S VG VDATDLFDRYDEEYEDVS GSSFPQIE INKMH I SQS I EAPL 
TATDTAI LSGSLSTQNGNGGGS INFALRRVTSAKGWGELEFGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRG I R PGLTTVLAR 
NLDKNTVGYLQWHCS S PLLQVQRPHRNTRACAPE PS FRP FLHVP 
TWDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARS XAADH 
KDL I HD VS FD FHGRRMATCS S DQS VKVWDKS E SGD WHCTAS WKT 
HSGSVWRVTWAHPEFGQVLAS CSFDRTAAVWEE I VGESNDKLRG 
QS H WVKRTTL VDS RTS VTD VKFAPKHMGLMLATC S ADGI VR I YE 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNS QWR VSVWI TGTVLAS SGDDGCVRLWKANYMDNWKCTG I h 
KGNGSPVNGSSQOGTSNPSIiGSNIPSLQNSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCE VWLGYG PRAAAAAAATVLFGGAGPTETMFVARS IAADH 
KD LTHD VS FD FHGRRMATCS S DQS VKVW D KS ES GD WH CTAS W KT 
HSGSVWRVTWAHPEFGQVLAS CS FDRTAAVWEE I VGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE 
APDVMNLSQWSLQHErSCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVS WNI TGTVLAS SGDDGCVRLWKANYMDNWKCTG I L 
KGNGS PVNGSSQQGTSNPSLGSNI PSLQNSLNGS SAGRKHS


6492 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTEHGTPKPFRK 
FDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCEI 
KRQE VINELF YTERAHVRTLKVLDQVFYQRVSREGILS PS ELRK 
IFSNLEDILQLHXGLNEQMKAVRKRNETSVIDQIGEDLLTWFSG 
PGEE KLKHAAATFCSNQPFALEM I KSRQKKDS RFQTFVQDAESN 
P LCRRLQLKD 1 1 P TQMQRLTKYP LLLDN I ATYTE W P TE RE KVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 
VEELRNLDLTKRKMIHEGPLVWKVNRDKTIDLYTLLLEDILVLL 
QKQDDRLVLRCHSKIIASTADSKHTFSPVIKLSTVLVRQVATDN 
KAL FV I SMS DNGAQ I YE L VAQTVS E KT VWQD L I CRMAAS VKEQS 
TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 
LESTLISSKPQSHSLSTSGKS E VRDLFVAERQ FAXEQHTDGTLK 
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLIiVQQLGLT 
EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 
FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH 
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 
EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP 
AVESTHQQQHSPQNTHSDGAISPFTPEFLVCX3RWGAMEYSCFEI 
QSPSSCADSQSQIMEYIHKIEADLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA 
KEQNTGHNNINGWQPSGTSKTLYSTNMALSSSPGISAVQLVRT 
VGHTTTNHL I PALCTSSPQTLPMNNS CLTNAVHLNNVS WS P VN 
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
E PERLG LNG I AE TTVAME VT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD - 
LKGKVLICRNYRGDVDMSEVEHFMPILMEXEEEGMLSPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKVVQVFSEYFKEL 
EEES IRDNFVI I YELLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPRP PATVTNAVS WRSEG I KYRKNEVFLDVI ES VNLLVSAN 
GNVLRSE I VGS I KMRVFLSGMPELRLGLNDKVLFDNTGRGKS KS 
VELED VKFHQCVRLSRFENDRT I S FI PPDGE FELMS YRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSE I VWS IKSFPGGKEYLMRAHFGL 
PS VEAEDKEGKPP IS VKFE I P YFTTSGI QVRYLKI I EKSGYQAL 
PWVRYITQNGDYQLRTQ 


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG 
VRFMW I KHNNLY L VATS KKNACVSL VPS PL YXWQVFS E YFKEL 
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPRP PATVTNAVS WRSEG I KYR KNE VFLD VI ES VNLLVSAN 
GNVLRS E I VGS I KMRVFLSGMPELRLGLNDKVLFDNTGRGKS KS 

VELEDVKFHOCVRTj^RFFMnRTT^PTPPTY^FFFT JWQVPT KTTUW 
vcitiDuvjvf nyuvixDour CjIn ui\ i x or x tr rlAj.D r Ctix'lo I KJ-il\ l ri v t\. 

PL I WI ES VI EKHSHS R I EYMI KAKS QFKRRS TANNVE IHI PVPN 

DADSPKFKTTVGS VKWVPENSE I VWS IKSFPGGKEYLMRAHFGL 

PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 

PWVRYITQNGDYQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPEYSIHSLFCIMFLCAQEWLTLGLNVPLLFY 
HFWR YFHCP ADS SELAYDP PWMNADTLS YCQKEAWCKLAFYLL 
S FF YYL YCM I YTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGESGV 
GKTNLLSRFTRNEFSHDSRTTI GVEFS TRTVMLGTAAVKAQ I WD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKELY 
DHAEATIWMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALDSTNVELAFETVLKEIFAKVSKQRQNSIRTNAITLGSAQAGQ
EPGPGEKRACCISL


6498 


2636 


272 


S LRLCPWGTHLAGPTTMRLSSLLALLRPALP L I LGLS LGCS LS L 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VP YYR D PN KP Y KKVLRTR Y I QTE LG S RERLLVAVLTS RAT LSTL 
AVAVNRTVAHHFPRLLYFTGQRGARAPAGMQWSHGDERPAWLM 
SETLRHLHTHFGADYDWFFIMQDDTYVQAPRLAAIiAGHLSINQD 
LYLGRAEEFIGAGEQARYCHGGFGYLLSRSLLLRLRPHLDGCRG 
D I hS ARPDEWLGRCLI DS LGVGCVSQHQGQQYRS FELAKNRDPE 
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALELERAYSEIEQL 
QAQIRNLTVL.TPEGEAGIiSWPVGLPAPFTPHSRFEVLGWDYFTE 
QHT F S C ADG AP KC PLQG AS RAD VG D ALETAL EQLNRR YQP RLR F 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LR P L S R VE 1 L P MP YVT EATR VQ LVL P LL VAE AAAAPAF LE AFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAEIiERRY 
PGTRLAWLAVRAEAPSQVRLMDWS KKHPVDTLFFLTTVWTRPG 
PEVLNRCRMNAI SGWQAFFP VHFQEFNPALS PQRS PPGPPGAGP 
DPPSPPGADPSRGAPIGGRFDRG^SAEGCFYNADYTiAARARIAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CS PRLS EELYHRCRLSNLEGLGGRAQLAMALFEQE QAKST 


6499 


3 


2040 


S CSADTRPSGCAW PTVGLRAAAGAFRTGS PLALG P ET PQ VACLP 
GHPPVRPQVSGGPGAMPDPAAHLP F F YGSI SRAEAEEHLKLAGM 
ADGLFLLRQCLRS LGGYVLSLVHDVRFHHFP I ERQ LNGTYAI AG 
GKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPQPGVFDC 
LRD AMVR D YVRQTW KLEG EALEQA IIS Q APQ VEKL I ATTAHE RM 
PW YHS S L TRE EAE R KL YS GAQTDG K FLLRPRKEQGT Y ALS L I YG 
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC 
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LIADI ELGCGNFGS VRQGVYRMRXKQ IDVAIKVLKQGTEKADTE 
EMMRE AQ I MHQL DNP Y I VRL I G VCQ AEALML VMEMAGGG P L»HKF 
LVGKREE I PVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVIjL 
VNRKYAKI SDFGLSKALGADDS YYTARSAGK^LKWYAPEC INF 

RKFSSRSDVWSYGVTMWEALSYGQKPYKKMKGPEVMAFIBQGKR
MECPPECPPELYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA
SKVEGPPGSTQKAEAACA


6500 


1773 


726 


TGPTHASACAWGLVRSVTEWCANVRGNPCAAALSCPQAVLDAGK 
MLSES S S FLKG VMLGS I FCAL 1 TMLGH 1 R IGHGNRMHHHEHHHL 
QAPNKEDI LKI SEDERMELSKSFRVYCI 1 LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFESIN^TNDMWLMMRKAYKYAFDK 
YRIX2YNWFFLARPTTFAIIENLKYFLLKKDPSQPFYLjGHTIKSG 
DLEYVGMEGGIVLSVESMKRLNSLLNIPEKCPEQGGMIWKISED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKSVGLSIKEAMTYHPN 

QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ


6501 


1 


570 


LVGMSGGGTETTPVGCEAAPGGGSKKRDSLGTAGSAHLIIKDLGE
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE
TNEHTLPKCRDTMRDSLSQVLQRLQAANDSVCRLQQREQERKKI
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ
YAEMEKDLAKFSTF


6502 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWMRSWLQQSYQAVKE
KSSEALEFMKRDLTEFTQWQHDTACTIAATASWKEKLATEGS
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ
EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG
KSTPSNKGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KS S EALE FM KRDLTE FTQWQHDTACT I AATAS WKE KLATEG S 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ 
EQARRDALKQRAEQS I S EE PGWEEEEEELMG I S PIS PKEAKVP V 
AKI STFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSES ISLVT 
Q I ANPATAPEARVLPKDLSQKLLEAS LEEQGLAVDVGETG PSP P 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAHWVCLS I LS P P PAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVP INPSLYL VKYDG IDCVYGLELHRDERVLS LKI LS 
DR VAS SHIS DANLANT 1 1 G KAVEHM FEGEHGS KD E WRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDD FH I YVYDL VKKS 


6505 


2131 


1294 


GKVCLVAHWVCLS ILSPP PAGMKTPNAQEAEGQQTRAAAGRATG 
SANMT KKKVSQ KKQRGR PSSQPCRN I VGCRI SHGWKEGDEPI TQ 
WKGTVLDQVP INPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTI IGKAVEHMFEGEHGSKDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
RE PGG WDGL I GKHVE YT KE DGS KR I GMVTHQ VEAKPS VYF I KF 
DDDFHIYVYDLVKKS 


6506 


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
ELVEDS HYSQSQL VCS D CGCWTEG VLTTTFS DEGNLRE VTYS R 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSG I RAARLQKKEVL VGCCVL I TCRQHNWPLTMGAI CTLLYAD 
LDVFSSTYMQIVKLLGLDVPSLCLAELVKTYCSSFKLFQASPSV 
PAKYVE DKE KMLS RTMQL VE LANETWLVTGRH PLP V I TAATFLA 
WQSLQPADRLSCSLARFCKLANVDLPYPASSRLQELLAVLLRMA 
EQIAWLRVLRLDKRSWKHIGDLLQHRQSLVRSAFRDGTAEVET 
REKEPPGWGQGQGEGEVGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVTIPIWQNKPHGA 
ARSVVRRIGTNLPLKPCARASFETLPNISDLCLJIDVPPVPTLAD 
I AW I AADEEETYAR VRSDTRPLRHTWKPS PLI VMQRNAS VPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGI LKD FHRMKQ SQDLNR S LL KE ED P AVL I SEVLRRKFALKE 
EDISRKGN 


6508 


662 


342 


WEARKR PQRW PSERRE VRVPPPHLQRGRSGLE PGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHWVDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
I LQLVGDAVHPQFKEI QKLI KEPAPDSGLLGLFQGQNSLLH 


6509 


2 


1053 


FVWNPRGGR KRRRQAAVTQAATRASGTPS PRDGTMTQGKLS VAN 
KAPGTEGQQQVHGE KKEAPAVPSAP PSYEEATSGEGMKAGAFPP 
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDHELFTTFSWDDQKV 
RRVFVRKVYTILLIQLLVTLAVVALFTFCDPVKDYVQANPGWYW 
ASYAVFFATYLTLACCSGPRRHFPWNLILLTVFTLSMAYLTGML
SSYYNTTSVLLCLGITALVCLSVTVFSFQTKFDFTSCQGVLFVL
LMTLFFSGLILAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHS LS PE E Y I FGALN I YLD 1 1 Y I FT FFLQL FGTNRE 


6510 


37 


1156 


PCALDGCPQRGAVHPLLSSAMGLLAFLKTQFVLHLLVGFVFVVS 
G L V I N FVQL CTLALW P VS KQ L YRRLNCRLAYS LWS Q L VMLLE WW 
S CTECTL FTDQAT VE RFGKEHAV 1 1 LNHNFE I DFL CG WTMCE R F 
QVLQSSK VLAXKELL YVPL IGWTWYFLE I VFCKRKWEEDRDTW 
EGLRRLSDYPEYWWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 
KYHLLPRTKG FTTAVKCLRGTVAAVYDVTLNFRGNKNPS LLG I L 
YGKKYEADMCVRR FPLEDI PLDEKEAAQWLHKLYQEKDALQE I Y 
NQKGMFPGEQFKPAPJIPWTLLNFLSWATILLSPLFSFVLGVFAS 
G S P LLI LT F LG FVG AGNGHCR 


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSFJOX3IYRRCWLVFRKSSSKGPQRLE 
KYPDEKSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAI I FTDD 
SARTFTCDSELEAEEWYKTLSVECLGSRLNDISLGEPDLLAPGV 
QCEQTDR FNVFLLPCPNLDVYGECKLQI THENI YLWD I HNPRVK 
L VS WPLCS LRR YGRDATR FT FEAGRMCDAGEGL YTFQ TQEG EQ I 
YQRVHSATLA IAEQHKRVLLEMEKNVRLLNKGTEHYSYP CTPTT 
MLPRSAYWHHITGSQNIAEASSYAGEGYGAAQASSETDLLNRFI 
LLKPKPSQGDSSEAKTPSQ 


6512 


159 


807 


FGKKSTWFPLSRSLRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK 
TLVSTCVILSGMTNI I CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKI I ERLEHLENVI KQH I QEAPAKP EE AEAEP FTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513 


2 


756 


FVS PE ?GF £ LAQLNL I WQ LTDT KQ L VHS FAEGQDQGSA YANRTA 
LFPDLLAQGNAS LRLQRVRVADEGS FTCFVS I RDFGSAAVS LQ V 
AAP YS KPSKTLEPNKDLRPGDTVTI TCSSYQGYPEAEVFWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHS I LRWLGANGT YSCLVRNP V 
LQQDAHSSVTITPQRSPTGAVEVQVPEDPVVALVGTDATLRCSF 
SPEPGFSLiAQLNLIWQLTDTKQLVHSFAEGQDQGSAYANRTAtiF 
PDLLAQGNAS LRLQRVRVADEGS FTCFVS I RDFGSAAVS LQVAA 
P YS KP SMTLE PNKDLR PGDTVT I T CS S YQG Y P EAE VFWQ DGQG V 
PLTGNVTTSQMANE QG L FDVHS I LR WLGANG TYS CL VRN P VLQ 
QDAHS S VTI T PQ R S PTGAVE VQVP EDP WALVGTDATLR CS FS P 
EPGFS LAQLNL I WQLTDTRQLVHS FTEGR 


6514 


985 


302 



VGIPGPTISSAAEMEDLLDLDEELRYSLATSRAKMGRRAQQESA 
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR 
KASEEI ED FRLRPQS LNGS DYGGD I P 1 1 PDLEE VQEEDF VLQ VA 
APPS I Q I KRVMT YRDLDNDLMKYSAI QTLDGE I DLKLLTKVLAP 
EHE VRE RN PS WQ DD VGWDWDHL FTEVS S E VLTE WDPLQTE KED ? 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGS TQLEVSAS ASCGALGS ADMNP I W 
VHGGGAGP ISKDRKERVHQGMVRAATVGYGILREGGSAVDAVEG 
AWALE DD P E FNAGCGSVLNTNGEVEMDAS I MDGKDLS AG AVS A 
VQCIANP I KLARLVMEKTPHC FLTDQGAAQFAAAMG VPEI PGE K 
L VTERN KKR L E KE KH K KG AQ KT D CQ KNLGTV G AVAL DC KGNVA Y 
ATSTGG I VNKMVGRVGDS PCLGAGG YADND I GAVSTTGHGES I L 
IO^TIjARLTLFHIEQGKTVEEAADLSLGYMKSRVKGLGGLIVVSK 
TGDWVAKWTS TSMP WAAAKDG KLHFG I DPDDTT ITDLP 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKELILGI^4VGTAGISLLLLWYHKVR 
KPGIAMKLPEFLSI^NTFNSITLQDEirTODQGTTVIFQERQLQI 
LEKLNELLTNMEELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 






WO 01/53312



PCT/US00/34263



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FPVPKAFNTRVEELNLDVLLQKVDHLRMSESGKSESFELLRDHK 
EKFRDE I E FMWRFARAYGDMYELSTNTQEKKHYANIGKTLS ERA 
INRAPMNGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
I KLLPEE P FLY YLKGR YCYTVS KLS WI EKKMAATLFGKI PS STV 
yivttijrtiN r jj^AbKLit-Pva xfaNFN xMYLAKCYTDLEENQNALKFCNL 
ALLLPTVTKEDKEAQKEMQK I MTSLKR 
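
The legend in the table header fixes the alphabet used in the amino acid segments: twenty one-letter residue codes plus X, *, / and \. A minimal sketch of how that legend could be used to screen a transcribed segment for characters that fall outside it is given below. The helper names (`normalize`, `unexpected_symbols`, `spanned_codons`) are hypothetical and not part of the source, and treating the two nucleotide locations as an inclusive range is an assumption.

```python
# One-letter codes and special symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def normalize(segment):
    """Remove whitespace introduced by line wrapping or text extraction."""
    return "".join(segment.split())


def unexpected_symbols(segment):
    """Return characters not defined in the legend, in order of first appearance."""
    seen = []
    for ch in normalize(segment):
        if ch not in LEGEND and ch not in seen:
            seen.append(ch)
    return seen


def spanned_codons(begin, end):
    """Codons covered by the two nucleotide locations (hypothetical helper).

    Assumes both locations are inclusive; begin may be larger than end,
    as in several rows of the table.
    """
    return (abs(end - begin) + 1) / 3


if __name__ == "__main__":
    print(normalize("E PERLG LNG I AE TTVAME VT"))
    print(unexpected_symbols("FVS PE ?GF"))
    print(spanned_codons(557, 1147))
```

Lowercase letters, digits and punctuation are reported as unexpected, which makes such a check a quick way to locate transcription damage without altering the sequence itself.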


6517 


3 


1414 


GRVWGGSSSLNAMVYVRGHAEDYERWQRQGARGWDYAHCLPYFR 

KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP
LTEDKNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA
EAETLVSRVLFEGTRAVGVEWKNGQSHRAYASKEVILSGGAIN
SPQLLMLSGIGNADDLKKLGIPWCHLPGVGQNLQDHLEIYIQQ
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFIR
SQPGVPHPDIQFHFLPSQVIPHGRVPTQQEAYQVHVGPMRGTSV

GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKLTREIFAQE 
AI^PFRGKE1K3PGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAVVDPQTRVLGVENIiRVVDASIMPSMVSGNLNAPTIMIA 
EKAADI I KGQPALWDKDVPVYKPRTLATQR 


6518 


242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHR CR RRA P P P PSTMGDAGS ERSKAPSLPPR CP CG FWGSSKTMN 
LCS KCFADFQKKQPDDDS APSTSNSQSDLFSEETTSDNNNTS IT 
TPTLS P SQQ PL PTELNVTS PS KEE CG P CTDTAHVS LIT PTKRSC 
GTDSQSENEASPVKRPRLLENTERSEETSRSKQKSRRRCFQCQT 
KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KMVKLDRKVGRS CQRI GEGCS 


6519


3 


1113 


ERKMAEPP S P VHCVAAAAPTATVS E KEP FGKLQLSSRDPPGSLS 
AXKVR TEEKKAPRRVNGEGGSGGNS RQLQ P PAAPS PQS YGS PAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VP PTLLHAQPHHLLLPAAAAAAS ANAKSRRPKE KREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK 
vm« h 1 XKiiWGE VKI LLKSG KEKP KTN I EDLQ I KKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPENSBFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ 


6520 


3 


1113 


ERKMAE P PSP VHCVAAAAPTATVS EKE PFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAR EAGGASRE E NG E VKP L PRDK I KDKI KERDKE KEREKKKHK 
VMNE I KKENGEVKI LLKSGKEKPKTNIEDLQI.KKVKKKKKKKHK 
ENEKRKRPKMYSKS I QTI CSGLLTDVEDQAAKGI LNDNIKD Y VG 
KNLDTKNYDS KI PENS EFP FVSLKEPRVQNNLKRLDTLEFKQL I 
HIEHQPNGGASVIHCLQ 


6521 


184 


1798 


j\_Lir ispim.! u I o^^JiJjVrirlsAlirLiI VGAQLIHADKLGEKVEDSTMP 
I RRTVNSTRETP P KS KLAEGEEEKPE PDI SS EES VST VEEQENE 
TPPATSSEAEQPKGEPENEEKEENKSSEETKKDEKDQSKEKEKK 
VKKTIPSWATLSASQLARAQKQTPf4ASSPRPKMDAILTEAIKAC 
FQKSG AS WA IRKYIIHFCYPS LE LERRG YLL KQALKRE LNRG VI 
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKNA 
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS 
AI AAMNE P KTCSTTALKKYVLENH PGTNSNYQMHLL KKTLQKCE 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
DEDESSEEDSEDEEPPP KRR LQKKTP AKS PG KAAS VKQRG S KPA 
PKVS AAQRGKARPLPKKAPPKAKTPAKKTRP S STV I KKPSGGSS 
KKPATSARKE 


6522 


1042 


391 


NKWLRPSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED
E C LD Y YGM LS LHRMFE VVGGQLTE CEL E LLAFLLD EAP GAAGGL 
SRARSGLKLLLELERRGO^ESNIiRLLGQLLRVLARHDLLPHLA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


AS COTRRRTAALDSGER I AGRRS P I ALAMASNFNDI VKQGYVKI 
RSR KLG I FRRCWLVFKKASS KGPRRLBKFPDEKAAYFRNFHKVT 
ELHNI KNI TRLPRETKKHAVA1 1 FHDETSKTFACESELEAEEWC 
KHLCMECLGTRLNDISIiGEPDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQI THENI YLWD I HNAKVKLVMWPLSSLRRYGRDST 

WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER

LMLEMEQKARLQTSLTEPMTLSKS I SLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTS TC KWHDLE 


6524 


2 


1097 


ASCQTRRRTAALDSGERI AGRRS PIALAMASNFNDI VKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAI I FHDETSKTFACESELEAEEWC 
KHLCMECLGTRL^ISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQ I THEN I YLWD I HNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQ KARLQTS LTE PMTLS KS I S LPRSAYWHHI TRQNS VGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6525 


1 


1853 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 
PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 

NRGPHGRSNGASSHKPGS SPSS PRE KDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQYLT PLQQ KE VTVRHLKTKLKES ERRLHERES E I VELKSQLAR 
MREDWIEEECHRVEAQLALKEARKEIKQLKQVIETMRSSLADKD 
KG I Q KYFVD IN I QNKKLE S LLQS MEMAHSGS LRDELCLD F P CDS 
PEKS LTLNPPLDTMADGLS LEEQVTGEGADRELLVGDS I ANSTD 
LFDE I VTATTTESGDLE LVHSTPGANVLELLP I VMGQEEGS VW 
ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 
MES FPESLS ALWDLTPRNPNSAI LLS PVETPYANVDAEVHANR 
LMRELDFAACVEERLDGVIPLARGGWRQYWSSSFLVDLLAVAA 
P WPTVLWAFS TQRGGTD PVYNI GAL LRGCCVVALHSLRRTAFR 
IKT 


6526 


2 


2034 


SGRAGE PEEWRGRQ 1 1 DS KETW I P FNS EDSQQLEEAYSS GKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVPYSES FSQ VLEETYMLAVTLDEWKKKLESPNREII ILHNP 
KIjMVHYQPVAGSDDWGSTPMEO/SRPRTVKRGVENISVDIHCGEP 
LQ 1 DHLVFWHG I G P ACDLR FRS I VQCVNDFRS VS LNLLQTH F K 
KAQENQQIGRVEFLPVNWHSPLHSTGVDVDLQRITLPSINRLRH 
FTNDTILDVFFYNSPTYCQTIVDTVASEMNRIYTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 

TPTLEEDLKKLQLSEFFDIFEKEKVDKEALALCTDRDLQEIGIP
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN
TRNGDYLDVGIGQVSVKYPRLIYKPEIFFAFGSPIGMFLTVRGL
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEEVLPINV
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV
LKEIYQTQGIFLDQPLQ


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS
FIFNGDAAPSESFWLDNEQKVYQRIHHEESEMBTEEEVDILMS
SDIYSATLSTKSISFTRAQTGWLFREDKTERVGNFLADFYLVNG
LVLESRKRREHLSEEDILRNKAIMESLSKGGNIMEQNFEPIRRQ 
SLTPPPQNTITWSEYISAENGKAPHLGRELVCKESKKTFKATIA 
MSQEFPLGIELLLNVLEWAPFKHFNXLREFVQMKLPPGFPVKL 
DIPVFPTITATVTFQEFRYDEFDGSIFTIPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKI LFCVLGLYIA 
I PFL I KLCPGIQAKLI FLNFVRVPYFIDLKKPQDQGLNHTCNYY 
LQPEEDVTIGVWHTVPAVWWKNAQGKDQMWYEDALASSHP I ILY 
LHGNAG TRGGDHR VEL Y KVLS S LG YHWTFD YRG WGDSVGTP S E 
RGMTYDALHVFDW I KARSGDNPVY I WGHS LGTGVATNLVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSGI KFANDENVKHI S CPLLI LHAEDDPWPFQLGRKLYS I AA 
PARSFRDFKVQFVPFHSDLGYRHKYIYKSPELPRILREFLGKSE 
PEHQH 


6529 


363 


2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDPDWLF 
EGVEDESFLKWFCGNVNEQNVLSERELEAFS ILQKSGKP ILEGA 
ALDE ALKTCKTS DL KTPRLDD KELEKLEDE VQTLLKLKNLK I QR 
RNKCQLMASVTSHKSLRLNAKEEEATKKLKQSQGILNAMITKIS 
NELQALTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPS I CD 
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESLHSLTSKAVDKENLDAKI SSLTSE IMKLEKEVTQ I KDRS LPA 
WRENAQLLNMPWKGDFDLQIAKQDYYTARQELVLNQLIKQKA 
SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDP S VSQQINP RNT I DTKD YS THRL YQ VLEGENKKKE LFLTHGN 
LEEVAE KLKQN I S LVQDQLAVS AQEHS FFLSKRNKDVDMLCDTL 
YQGGNQLLLSDQE LTEQ FHKVE S QLNKLNHLLTD I LADVKTKRK 
TLANNKLHQMEREFYVYFLKDEDYLKDIVENLETQSKIKAVSLE 
D 


6530 


128 


2986 


GAAHHGAI VQVHPLLPGS ST IM IHDLCLVFPAPAKAWYVS D IQ 
E LY I R WDKVE I GKTVKAYVR VLDLHKKP FLAK YFP FMDLKLRA 
ASPIITLVALDEALDNYTITFLIRGVAIGQTSLTASVTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FSISNESVALVSAAGLVQGLAIGNGTVSGLVQAVDAETGKWir 
S QDL VQ VEVLLLRAVR IRAP I MRMRTGTQMP I YVTG I TNHQN P F 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEAS IRLPSQYNFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPE I EAEQILMS PNS YI KLQTNRDGAAS LS YRVLDGPEKVP V 
VHVDEKGFLASGSMIGTSTIEVIAQEPFGANQTIIVAVKVSPVS 
YLRVSMSPVLHTQNXEALVAVPLGWTVTFTVHFKDNSGDVFHAH 
SSVLNFATNRDDFVQIGKGPTNNTCVVRTVSVGLTLLRVWDAKH 
PGIiSDFMPLPVLQAISPELSGAMVVGDVLCLATVLTSLEGLSGT 
WSSSANS ILHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEWV 
S VPQR I MARHLHP I QTS FQEATAS KVI VAVGDRS SNLRGECTPT 
QREVIQALHPETLlSLQSQFKPAVFDFPSQDVr 1 VEPQFDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
E VP FS PGLFADQAE I LLSNH YTSSEIRVFGAPEVLENLE VKSGS 
P AVLAFAKEKSFG WPS FITYTVGVLDPAAGSQGPLS TTLTFS S P 
VTNQA1AI PVTVAFWDRRGPGPYGASLFQHFLDS YQVMFFTLF 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
S PTSPNALP PARKAS PPSGLWS PAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
S LCMV I T I YYDVKVRFJ VRG CGQY I S YRCQE KRNT YFAE YW YQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAQTEPIX3LDPM^LSriNLGLSFAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL
ITDS TGTHL VLTVTN KNAHS PGLS RGSPQQP S SQPGS PAP APS A 
QMDLEHPIjQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDIL I QSGE I S ADFKEPPS L PGKEKPS PKTVCWS PLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 

LQLHWDSCL 


6533 


1798


373 


STISWU^VEPPRRSSGVGAARIiRFPGGSRPLWJ^CVIiAliAVL 
ALLERNNADSMS AHSMLCER I AIAKELI KRAESLSRS RKGG IEG 
GAKLCS KLKAELKFLQKVEAGKVAI KESHLQSTNLTHLRAI VES 

7\ T?XTT f f TrtrOTTT UTTCr* V T T r n r PT _0 C 1 VOTT AArn\A7B MnnUTtJ^ TV* Tf~"D 
AiLNJjtJtj V va v LM Vf \a x 1U1 UctlLr^J X Jj V V1JV ViUilJunl W VRAlbK 

KAEALHNI WLGRGQYGDKS 1 1 EQAEDFL QASHQQ P VQ YSNPH 1 1 
FAFYNSVSSPMAEKLKEMGI S VRGD IVAVNALLDHPEELQPSES 
ESDDEG PELLQVTRVDREN1 LASVAFPTE I KVDVCKRVNLDITT 
LITYVSAIiSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQS I LDTLGGPGERERATVL I KR INWPDQPS 

FP AT .P T . VA ^ Q Y T W P <5 T ,T T FRTCJDT T ■ K" A T TMT AN 9 G FVR AANND 
G VKF S V F I HQ P RALTES KE A1ATPL P KD YTTD S EH 


6534 


47 


596 


KATRF I SAAFWLNKQG VS PAKLPHTSWS WSLQTLS FLFSGDLA 
E KS LQ C FP CS AMLLE LI P LLG I HFVLRTARAQ S VTQ PDIH I TVS 
EGASLELRCNYSYGATPYLFWMERTVEEAFILLVCLKPWRVASS 
LEKKEKEDESFQLLLGSRYNVLKAHCLLPLIRWLTSGDSLLSAQ 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTLAFLEIDKAFSSHARLS 
anATT t TQ^TTaTVAr.r.pnnTrr.wacivfinQPATT.r'PK'rt^DMK'r. 

t\L}r\ i LiLtl obi I/ii v M l it ilvLAjl C<Liv V J-LO V ulJoRiil Xj ^l\.iVVjtIxir 1"] 

TIDHTPERKDEKERIiCKCGGFVAWNSLGQPHVNGRLAMTRSIGD 
LDLKTS G VI AE PET KR I KLHHADDS FL VLTTDG INFMVNSQE I W 
DF VNQ CHD PNEAAHAVTEQAI Q YGTE DNS TA WVP FG AWG K YKN 
S E I NFS FS RS FASSGR WA 


6536 


242 


1174 


SLVKEMTNQYGI LFKQEQAHDDAI WSVAWGTNKKENS ETWTGS 

T nnT A7VVWVT^nWPT.T^T/^WQT.T?nMnT/2VVQ\/riTQ'P T rT.PT ABQQQ 

XlUULi" J\ V r4 MAI KUuIUJiyiJU" O JjCiVJf*\Ji-r\J V VOVUlOUiDf l/ViD o o 

LDAHIRLWDLENGKQ I KS IDAGPVDAWTLAFSPDSQYLATGTHV 
G KVNI FG VE S GKKE Y S LDTRG KF I LS I AYS P DGKYLASGA I DG I 
INIFDIATGKLLHTLEGHAMPIRSLTFS PDSQLLVTASDDGYIK 
I YDVQHANLAGTLS GHASWVLNVAFCPDDTHFVSS S S DKS VKVW 
DVGTRTCVHTFFDHQDQVWGVKYNGNGSKIVSVGDDQEIHIYDC 
PI 


6537 


1638 


921 


NRFNPP PTQGPDPS LVYRPD VD PEVAKD KAS FRNYTS GPLLDRV 
PTTVIfT MWTHnTVnVVPQlfHAOFGGF^YKKMTVMEAVDTjLnGT.V 
DE S D PDVDF PNS FHAFQTAEG I RKAHPD KD W FHLVGLLHDLGKV 
LAL FGE PQ WAWGDT F P VGCRPQAS WFCDS T FQDNP DLQDPR Y 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGQRWGTVGGGG 
AEAVPAGDTLSPQSTCTR 


6538 


3345 


2412 


P YLYDFLDAL I TCQTAPEEAF I ICLDGLAGMLTEQLRRLTKQVQE 
ARHNRDDEAI KKAVNE YDETMEXYI PVLMAQAKI YWNLENYPMV 
E K I FRKSVE FCNDHD WXINVAHVL FMQENKYXEAIG FYEP I VK 
KHYDNI1^SAIVIANLCVSYIMTSQNEI(AEE1>IRKIEKEEEQL 
S YDDPNRKMYHLC I VNL VI GT L YCAKGNYE FG I S RV I KS LE P YN 
KKLGTDTWYYAKRCFLSLLENMSKHMIVIHDSVIQECVQFLGHC 
BLYGTNIPAVIEQPLEEERMHVGKNTVTDESRQLKALIYEIIGW 
NK 


6539 


218 


339 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 


6540 


3 


391 


LERLWIXLLRRPEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
KQHAWLPLT IE I KDRLQLRVLLRREDWLGRPMTPTQ IGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMLLELLPDD 



6541 


1165 


536 


RT LVQRR I LMLLR KPARGRDLRGRGRGT PRGGRKGL L PTPDE F P 
R FEGGR KPDS WDGNRE PGPGHEH FRDT P R PDHPPKDGHS P AS RE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRS GS NWGRG S NMNSGP P RRGAS RGGGRGR 


6542 


3 


3775 


S W P RGRG E TGGH PG ALRTRTM Q KS VR YNEGHAL YLAF LARKBGT 
KRGFLS KKTAEASRWHEKWFAL YQNVLF YFEGEQSCRPAGM YLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREVLMQKYIHLVQIVET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDIKKI KKVQS FMRGWLCRRKWKTI VQDYI CSPHAESMRKR 
NQIVFTMVEAESEYVHQLYILVNGFLRPLRMAASSKKPPISHDD 
VSSIFLNSETIMFLHEIFHQGLKARIANWPTLILADLFDILLPM 
LNI YQEFVRNHQ YS LQ VLANCKQNRDFDKLLKQ YEANPACEGRM 
LETFLTY PMFQ I PR YI I TLHELLAHTPHEHVERKSLEFAKSKLE 
ELS R VMHDEVSDT EN I R KNLAI ERM I VEG CD I LLDTS QT F I RQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTGGVLSLIDCTLIEEPDASDDDSKGSGQVFGHLDF 
KI WE PPDRAAFTWLLAPSRQEKAAWMSDI SQCVDNIRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YAS VERLLERLTDLRFLS IDFLNTFLHTYRI FTTAAWLGKLSD 
I YKRPFTS IPVRSLEIiFFATSONNRGEHLVDGKS PRf.n? KV<Z qd 
PPLAVSRTSSPVRARKLSLTSPLNSKIGALDLTTSSSPTTTTQS 
P AAS P P PHTG QI PLD LS RGLS S PEQS P GTVE ENVDNPR VDLCNK 
LKRS I QKAVLES APADRAG VES S PAADTTELS P CRSPSTPRHLR 
YRQPGGQTADNAHCSVS PASAFAIATAAAGHGS P PGFNNTERTC 
DKEFI IRRTATNRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKliEDIIQMTDC 
MKAECFESLS AMELAEQ I TLLDHVI FRS I PYEEFLGQGWNKLDK 
NERTPYIMKTSQHFNDMSNLVASQIMNYADVSSRANAIEKWVAV 
AD I CRCLHNYNGVLE I TSALNRSAI YRLKKTWAKVS KQTKALMD 
KLQKTVSSEGRFKNLRETLKNCNPPAVPYLGMYLTDEjAFIEEGT 
PNFTEEGLVNFS KMRM I SH I I RE IRQFQQTS YR I DHQPKVAQ YL 
LDKDLI IDEDTLYELSLKIEPRLPA 


6543 


1857 


950 


F VS GCGRAG I GLS WAMAAE AR VS RW YFGGLAS CGAACCTH P LDL 
LKVHI^TQQEVKLRMTGMALRVVRTDGILALYSGLSASLCRQMT 
YSLTRFAIYETVRDRVAKGSQGPLPFHEKVLLGSVSGLAGGFVG 
TPADLVNVRMQNDVKLPQGQRRNYAHALDGLYRVAREEGLRRLF 
SGATMAS SRGALVT VGQLS CYDQAKQLVLSTG YLS DNI FTHFVA 
S F IAGGCATFLCQPLDVLKTRLMNS KGEYQGVFHCAVETAKLG P 
LiAFYKGLVPAG I RLI PHTVLTFVFLEQLRKNFG I KVP S 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQKMSEEVRAEPQEEEE3KEGKEEKEEGEMAPLPEAHLG t 
EGKQKECP 


6545 


176 


560 


PPHSHAALLPAAMTPLLTLILWI^GLPLAQALDCHVCAYNGDN 
CFTTPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFErVYDGY 
S KHAS TTS CCQ YDLCNGTGLATPATLALAP I LLATLWGLL 


6546 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS 
SPGVLKVLAQLGLGFSCANKAEMELVQHIGIPASKIICANPCKQ 
IAQIKYAAKHGIQLLSFDNEMELAKWKSHPSAKMVLCIATDDS 
HS LS CLSLKFGVS LKS CRHLLENAKKHHVEWGVSFKIGS GCPD 
PQAYAQS I ADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVRF 
EE I AS V INS ALDL YFPEGCGVD I FAELGR YYVTS AFTVAVS 1 1 A 
KKE VLLDQPGREEENG S TS KTI VYHLDEG VYG I FNS VLFDNI CP
TP I LQKKPSTEQPLYS S SLWG PAVDGCDC VAEGLWLPQLHVGDW 
LVFDNMGAYTVGMGS P F WGTQACHITYAMSRVAWEALRRQLMAA 
EQEDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHAIiRPLLLLPLVIiL 
PPLAAAAAGPNRCDTI YQGFAE CLI RLGDSMGRGGELET XCRSW 
NDFHACAS QVLSGCPEEAAAVWE SLQQEARQAPRPNNLHTLCGA 
PVHVRERGTGS ETNQETLRATAPALPMAPAPPLLAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG
IKGCGITFTW5KGTEVGELKILSRFQNA 


6549 


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQEE I IQARKHKLIKMC 
SS VAAKLWFLTDRR I RED YPQ KE I LRALKAKCCEEELDFRAWM 
DEWLT I EQGNLGLR I NGE L I TA Y PQ VWVR VPT P W VQS DS D I T 
VLRHLEKMGCRLMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS 
YGGHENFAKM I DEAE VLE FPMWKNTRGHRGKAVFLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMMCSLSEQGKQLAIQVSNILGMDVCGIDLL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
RLTRRMSLLSWSTASETSEPELGPPASTAVDNMSASSSSVDSD 
PES TERELLT KLPGGLFNMNQLLANE I KLLVD 


6550 


2293 


922 


FRVSRDGAPDCGIEQMGLAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVS LIQFLII LGLVLFMVYGNVHVS TESNLQATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 
QCQGDRVI YTNNQRYMAAI I LSEKQCRDQFKDMNK3CDALL FML 
NQKVKTLE VE I AKEKT I CTKDKES VLLNKR VAEEQL VE CVKTRE 
LQHQERQLAKEQLQKVQALCLPLDKDKFEMDLRNLWRDS 1 1 PRS 
LDNLGYKLYHPI/3SELASIRRACDHMPSLMSSKVEEliARSLRAD 
I ERVARENS DLQRQKLEAQQGLRASQEAKQKVEKEAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 
RNSALDTC I KTKSQPMMPVSRPMGPVPNPQP I DPASLEEFKRKI 
LESQRPPAGIPVAPSSG 


6551 


157 


748 


IQP PD PRNMTLAAYKE KMKE LPLVSLFCSCFLADPLNKS S YKYE 
ADTVDLNW C V I S DMEV I E LNKCTSGQS FE VI LKP P S FDG VP E FN 
ASLPRRRDPSLEE I QKKLEAAEERRKYQEAELLKHLAEKREHER 
EVI QKA I EENNNF I KMAKEKLAQKME SNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVI SDMEVIELNKCTSGQS FE VILKP PS FDGVP EFN 
AS L PRRRD P S LE E I QKKLEAAEERRK YQEAEL LKHLAE KREHER 
EVIQKAI EENNNF I KMAKEKLAQKMESNKBNREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6553 


2 


1807 


FVWSKMAAHLSYGRVNLNVLREAVRRELREFLDKCAGSKAIVWD 
E YLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKNI I FFV 
RPRLELMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
LGVLGSFIHREEYSLDLIPFDGDLLSMESEGAFKECYIiEGDQTS 
L YHAAKG LMTLQ AL YG T I PQ I FG KGE CARQ VANMM I RM KRE FTG 
S QNS I FP VFDNLLLLDRNVDLLTPLATQLTYEGLI DE I YG I QNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKJCLQLNSAEELYAEIRDKN 
FNAVGSVLSKKAKI ISAAFEERHNAKTVGE I KQFVSQLPHMQAA 
RGSLANHTSIAELII<DVTTSEDFFDKLTVEQEFMSG1DTDKVNN 
Y I EDC I AQKHSL I KVLRLVCLQSVCNSGLKQKVLD Y YKRE ILQT 
YGYEHILTLHNLEKAGLLKPQTGGRNNYPTIRKTLRLWMDDVNE 
QNPTD I S YVYSG YAPLS VRLAQLLSRPG WRS I EEVLR I LPGPHF 
E E RQPL PTGLQ KKRQPG ENRVTL I FFLGGVT FAE IAALRFLS QL 
EDGGTEYVIATTKLMNGTSWIEALMEKPF 



6554 


119 


1244 


FEMGSQVSVESGALHWIVGGGFGGIAAASQLQALNVPFMLVDM 
KDS FHHNVAALRAS VETG FAKKT FIS YSVTFKDNFRQGLWG I D 
LKNQMVLLOGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSREIVWGGGSAGVEMAABIKTEYPEKEVTLIH 
S Q VALAD KE LL PS VRQE VKE I LLR KG VQLLLS ERVSNLE E LPLN 
EYREYIKVQTDKGTEVATNLVILCTGIKINSSAYRKAFESRLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
AN I VNS VKQRPLQAYKPGALTFLLSMGRNDGVGQI SG FYVGRLM 
VRLTKS RDL FVS TS WKTMRQS P P 


6555 


1552 


498 


I HMALLR K I NQ VLL FLLI VTLC V I L YKKVHKGTVP KND AD D ES E 
TPEELEEE I P W I CAAAGRMGATMAAI NS I Y S NTDAN I L F YWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQ P LNFVR FYL P LL I HQHE KVI YLDDD V I VQGD I QE L YDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYWGYLDYRKKAIKDLG 
T P <? TP FWP ft V T VATJMTF W KHOR T TKOLE KWMQ KNVEENL YS S 
SLGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIILPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSQGQLRVPWFVTNAGNILQHS KAQELS ALLG 
CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGLGFR 
NWTVDE LRMAFP LLDMVDLERRLKTTP LPRND F PR I EG VLLLG 
EPVRWETSLQLIMDVLLSNGSPGAGLATPPYPHLPVLASNMDLL 
WMAE AKMPR FGHGT FLLCLET I YQ KVTGKELR YEGLMGKP S I LT 

vnv fiT?nTiTT3T3ARI?PT? RWRRP TRKT i Y A VG DNPM <3 D VYG ANLFHO Y 
LQKATHDGAP E LGAGGTRQQQ PS AS Q S C I S I LVCTGVYNPRNPQ 
STEPVLGGGEPPFHGHRDLCFSPGLMEASHWNDVNEAVQLVFR 
KEG WALE 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
SKLQ FNTTNCRSDTVMEKRS FKVPLGKGRRCWLADGFYEWQRC 
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 
MAGI FDCWEPPEGGDVLYSYTI ITVDSCKGLSDIHHRMPAILDG 
E E A VS KWLDFG E VS TQEALKL I HP TENI T FHAVS S WNNSRNNT 
PECLAPVDLWKKELRASGSSQRMLQWLATKSPKKEDSKTPQKE 
ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ 


6558 


21 


1138 


FHGRRRGGRKMELGS CLEGGREAAEEEGE PEVKKRRLLCVE FAS 
VASCDAAVAQCFLAENDWEMERALNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDS'TTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLS ERARGVCS YLALYS PDVX FLQEVI PP YYS YLKKRSS 
NYEIITGHEEGYFTAIMLKKSRVKLKSQEIIPFPSTKMMRNLLC 
VHVNVSGNELCLMTSHLESTRGHAAERMNQLKMVLKKMQEAPES 
ATVIFAGDTNLRDREVTRCGGLPNNIVDWEFLGKPKHCQYTVfD 
TQMNSNLG ITAACKLRFDRI FFRAAAEEGHI I PRSLDLLGLEKL 
DCGRF P S DHWGLLCNLD 1 1 L 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
DKSC^CGVCLPSTCPHTVVILLEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
G VDHT KMS LHG ASGGHERSRDRRRS S DRS RDS S HERTE S QLTP C 
IRNVTS PTRQHHVEREKDHSSSRPSS PRPQKASPNGSI S SAGNS 
SRNS SQS S S DGS CKTAGEMVFVYENAKEGARN r RTSERVTLIVD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGEYEVAE 
GIGS TVFRA I LD YY KTG 1 1 RCPDG I S I P ELREACDYLC I S F EYS 
TIKCRDLSALKHELSNDGARRQFEFYLEEMILPLMVASAQSGER 
ECHIWLTDDDWDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE 
NRDVAKSVLKERGLKKIRLGIEGYPTYKEKVKKRPGGRPBVIYN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKSKSITNLAAAAADIPQD 
QLWMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 
P C PMAALTDLS FM YRW F KN CNLVGNLS E KYVF I TGCDSG FGNL L 
AKQL VDRGMQ VLAAC FTEEGSQ KLQRDTS YRLQTTLLD VT KS E S 
I KAAAQWVRDKVGECK3LWALVNNAGVGLPSGPNEWLTKDDFVKV 
INVNLVGLIEVTLHMLPMVKRARGRWNMSSSGGRVAVIGGGYC 
VSKFGVEAFSDSIRRELYYFGVKVCIIEPGNYRTAILGKENLES 
RMRKLWERLPQETRDSYGEDYFRIYTDKLKNIMQVAEPRVRDVI 
NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTEPCVFSGGWL 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPG P KQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLPVAYKAQGNAWVDKE I FS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAI FS VACAWNAVP SHVFRRAWRKL W P S VAFAEGSSSE E E 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGRPPAATSPAEWWSSBKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLR FAERQ P C FS AQ E VGQLRALRAVFRS QQQVRRRR 
G ALGAWKVEALQEG PGGCGATAQ S PLP CS S TAGDN 


6563 


1319 


2694 


LARPAQPVLLREPEGAGPPVPAGHLVHHLQGGHLRERAHPDLEA 
HEH PLP CDQMFWRQMGGHLRMVEANS RGWWG I G YDHTAWVYTG 
G YGGG CFQG LAS S TSNI YTQS D VKCVHI YENQRWN P VTG YTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAWVSDWFVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
EVPPIALRDVS I IPESPGAEGSGHS I ALWAVS DKGD VLCRLGVS 
ELNPAGSSWLHVGTDQPFASIS IGACYQVWAVARDGSAFYRGSV 
Y PS Q P AGDC W YH I PS P PRQRLKQ VS AGQTS VYALDENGNLW YRQ 
GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVrANKVQGSHSLS 
RGTVCHRTGVQ PHEPKGHGWD YG I GGGWDHI S VRANATRAPRSS 
SQEQEPSAPPEAHGPVCC 


6564 


1 


975 


APGS CALWS YCGRGWSRAMRGCQLLGLRSSWPGDLLS ARLLSQE 
KRAAE THFG FETVS EEE KG G KVYQ V FES VAKKYDVMNDMMS LG I 
HRVWKDLLLWKMHPLPGTQLLDVAGGTGDIAFRFLNYVQSQHQR 
KQKRQLRAQQNLSWEEIAKEYQNEEDSLGGSRVVVCDINKEMLK 
VGKQKALAQGYRAGLAWVLGDAEELP FDDDKFD I YTIAFGIRNV 
THIDQALQEAHRVLKPGGRFLCLEFSQVNNPLISRLYDLYSFQV 
IPVLGEVIAGDWKSYQYLVESIRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGR Y ISLI LAVQIAYLVQAVRAAGKCD 
AVFKGFSDCLLKI^DSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGS 
LLP AF PVLLVS LS AALATWLS F 


6566 


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
HVPPGCSQGLNPLYYNLCDRSGAWG I VLEAVAGAG I VTTFVLTT 
ILVASLPFVQDTKKRSLLGTQVFFLLGTLGLFCLVFACVEKPDF 
STCASRRFLFGVLFAI CFS CLAAHVFALNFLARKNHGPRGWVI F 
TVALLLTLVEVI INTEWLI ITLVRGSGEGGPQGNSSAGWAVASP 

LLTTATS VAI WWW I VMYTYGNKQHNS PTWDDPTLAIALAANAW 
AFVLFYVIPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
QSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTEMAL 
MHKVPSEGAYDIILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKRSNLKAYACSIHHIRTMSYVFVNDSSQTNVPLLQACIDGDFN 
YSKRLLESGFDPNIRDSRGRTGLHLAAARGNVDICQLLHKFGAD 
LLATD YQGNTALHLCG HVDT I Q FL VS NGLK I D I CNHQGAT P LVL 
AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
ME SHS LLN PNLQQGEG VLS S FRTT WQ E FVEDLGFWRVL LL I FVI 
ALLSLGIAYYVSGVLPFVENQPELVH 


6568 


3 


1183 


HASDRLLVLPDNYSHFSQ^SANLQGPSRTTELFHPTLASISSPM 
LEGAELYFNVDHGYLEGLVRGCKASLLTCX3DYINLVQCETLEDL 
KI HLQTTDYGNFLANHTNPLTVS KI DTEMRKRLCGEFE YFRNHS 
LE P LS T FLT YMT CS YM I DNV I LLMNGALQKKS VKE ILG KCHP LG 
RFTEMEAVNIAETPSDLFNAILIETPLiAPFFQDCMSENALDELN 
I E LLRNKL YKS YLEAFY KF CKNHGDVT AE VMC P I LEFEADRRAF 
IITLNSFGTELSKEDRETLYPTFGKIjYPEGLRLLAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKINSYIPIL 


6569 


205 


1532 


RRRGPQRLGKGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR
RMSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI

AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTSPLVNNFTMHSDLGKI IQSLLDEFWKNPPVLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSS STTSHTTAKPAAPS FGVLSNLPLPI PTVDAS I PTSQNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVLLEQFLTLPQLK 
QIITDKDDLVKSIEELARKNLLLEPSLEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARIjKVAAHEAEEESDNIAED 
FLEGKME I DDFLSS FMEKRTI CHCRRAKEEKLQQAIAMHSQ FHA 
PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGS KALPAP I PLHPSLQLTNYS FLQAVNT F PATVDHLQG 
LYGLSAVQTMHMNHWTLGYPNVHEITRSTITEMAAAQGLVDARF 
P FP AL P FTTHL FH P KQGA I AHVL PALH KDR PR FD FANLAVAATQ 
EDPPKMGDLSKLSPGLGSPISGLSKLTPDRKPSRGRLPSKTKKE 
FICKFCGRHFTKSYNIiLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHR Y I HS KEKP FKCQECGKGFCQSRTLAVHKTLHMQTS S PTAA 
SSAAKCSGETVICGGT 


6571 


169 


656 


APDMNRKKLQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWIHGL 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
S EEDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
E AFG P CL P D PKSQME FEAQ VFKD PNE FNDM 


6572 


49 


1^46 


T P ERAQ PG ALLGAAGCC VCGGRW W PRSHERG YFS S AKMGS KRRN 
LS CSERHQKLVDBNYCKECLHVQALKNVNSQ IRNQMVQNENDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKKESLKDEKMRQQVRENSIELRELEKKLKAAYMNKERAA 
Q I AEKDAI KYEQMKRDAE IAKTMMEEHKRI I KEENAAE D KRN KA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
E FANMQQQR E E DRMAKVQENEE KRLQLQNALTQKLE EM LRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDR I ELMNAQKQRMKQLE 
HRRAVEKLIEERRQQFLADKQRELEEWQLQQRRQGFINAIIEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 
CEEK 


6573 


767 


275 


GGGGGESQS FRAQDGTRTPATDCLMYIXX3PRKIjMrO<X5YDMVQK 
LFLDFFRRRLSQRPTAEELEQRNILKPRNEQEEQEEKREIKRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR 
LTAADKVSRGECWRVGGRTVCWVSLGSPIiGSV 


6574 


204 


1159 


LESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARNAEKAMTA 

KVAQ I QNAGLGE PR I RDLNDE INKLLREKGHWEVRIKELGGPDY 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKTRAE LMKAI D F E Y YG YLD EDDG V I VPLEQE YE KKLRAEL VE 
KWKAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQQKFIAHVPVPSQQEIEEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820 


S PALASQSGGI TEEKMLEPQENGVI DLPDYEHVEDETFPPFPPP 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 
S E RGIjjP AJjKH V r U KAK F KG KGH E AEDJ-iKMIj 1 KHMhnWAnK.LirPK 
LQFEDFIDRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV 
AENNEHD VTS TELDP FLTNLS ES EMFASELS I S LTEEQQQRIER 
NKQIiALERRQAKLP 


6576 


1 


1060 


PEPQALVGQKRGALRIiLVARLVLTVSAPAEVRRRVLRPVLSWMD 
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WPVANCGVQEYNSNPKEHMTLRDYITYWKEYIQAGYSSPRGCL 
YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY 
AGPAGSWSPFHADIFRSFSWSVNVCGRKKWLLFPPGQEEALRDR 

tj/tktt nVniTTCDM (T^iTUT UDOWAT » C DDT 17 TTHI? ftf TTknnPVDOr"' 

HGNJbPXIJV 1 br'AljCDrHLHPKIvyjjAVjFi'JjlSl AUfcfcAljbMVr V f fc>Lr 
WHHQVHNLVMCCFSCPLSGAFJjQEDGSTTSPLSQPELGWNGVAH 
G 


6577 


2271 


987


S DRMASDD F D I V I EAMLE AP Y KKE EDEQQRKE VKKD Y PS NTTS S 
TSNSGNETSGSSTIGETSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRS WDRRHG S E S R SRDH RREDR VH YRSPPLATG YR YGHS KS PHF 
REKSPVREPVDNLSPEERDARTVFCMQLAARIRPRDLEDFFSAV 
GKVRDVRI I SDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG 
VP I IVQASQAEKNRIiAAMANNLQKGNGGPMRLYVGSLHFNITED 

EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA 
LNPAALTALSPALNLASQCLQLSSLFTPQTM 


6578 


377 


1489 


PS S S ATMNRAPLKRATI LHMALTGAS DPSAEAEANGEKP FLLRA 
LQIALWSLYWVTSISMVFLNKTLIjDSPSLRLDTPIFVTFYQCL 

MITFNNLCLKYVGVAFYNVGRSLTTVFNVLLSYLLLKQTTSFYA 
LLTCGI I IGGFWLGVDQBGAEGTLSWLGTVFGVLASLCVSLNAI 
YTTKVLPAVDGS I WRLTFYNNVNAC I LFLPLLLLLGELQALRDF 
AQLGSAHFWG^4MTLGGLFGFAIGYVTGLQI KFTS PLTHNVSGTA 
KACAQTVLAVL YYEETKS FLW WTSNMMVLGGS SAYTWVRGWEMK 
KTPEEPS PKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT 
I YMGKDKYENEDL I KHGWPED I WFHVDKLSSAHVYLRLHKGENI 
EDI PKEVLMDCAHLVKANS IQGCKMNNVNWYTPWSNLKKTADM 
DVGQIGFHRQKDVKIVTVBKKVNEILNRLEKTKvERFPDLAAEK 
ECnU)REERNEKKAQlQEMKKREKEEMKKKP>EMDELRSYSSLMKV 
ENMSSNQDGNDSDEFM 


6580 


62 


1571 


LVALKNWKPKGTNI PAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPIJ^VKVEEKEEKGICYLPSLEMFRQRFRQFGYHDTPG 
PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KISSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDI ISVI IANKPE 
ASLERQCVNLENEKGTKPPIiQEAGSKKGRESVPTKPTPGERRYl 
CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTL 
HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 
KAFSGKGSLIRHYRIHTGEKPYQCNECGKSFSQHAGLSSHQRLH 
TGEKPYKCKECGKAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSKHQRVHTGEGEAP 


6581 


228 


476 


RVFLKDLSSTPMASNNTAS I AQARKLVEQLKMEAN IDRIKVSKA 
AADLMAYCEAHAKEDPLLTPVPASENPFRBKKFFCAIL 


6582 


1428 


718 


C FTTKTHCS PVSVPYLS PLVLRKELES LLENEGDQV I HTSSFIN 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVET IRQSI QHNNVLKP I NLLSQQMKPGMKRQ RS LYRE 
I LFLSLVSLGREN I D I EAFDNEYG I AYNS LSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI 


6583 


487 


41 


RI FSMTSGRLRWRCTWRPATALWSASLRLiGTSSMHPS PRSISLP 
LS MMLSP LPSNTRGLS PTALFRS PDS EHATS CPRLHLWR CRAPL 
RS PS PLGRLQVLPRS PLHVHTHNSGKEVUJLQVQRS RSGTGPAC 
SQAGSGAVQGGNWCI F 


6584 


189 


1750 


PLPMAALGPS SQNVTE YWRVP KNTTKKYN I MAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG 
I VLKE FRPEDQP WLLRVNGKSGRKFKG IKKGGVTENTS YYI FTQ 
CPIX3AFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNH 
FSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAE 
GGS TS STLRAAAS KLE QGKR VS BMPAAKRLRLDTG PQS LS G KST 
PQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 
TGLSS EQTVNVLAQIIjKRLNPERKM indkmhfslke 


6585 


3 


1678 


GP I RNSRIDDFVGGDPRAEASCS VLHS KPHAMADSRDPASDQMQ 
HWKEQRAAQKADVLTTGAGNPVGDKLNVITVGPRGPLLVQDVVF 
TDEMAHFDRERIPERWHAKGAGAFGYFEVTHDITKYSKAKVFE 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDFWSLR 
P ES LHQVS FL FS D RG I PDGHRHMNG YG S HTFKL VNANGE AVYCK 
FHYKTDQGIKNLSVEDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFYIQVMTFNQAET FP FNP FDLTKVWPHECDYPLIPVGKLVLNRN 
P VNYFAEVEQ I AFDPSNMP PG I EAS PDKMLQGRLFAYPDTHRHR 
LGPNYLH I PVNCP YRARVANYQRDG PMCMQDNQGGAPNY YPNS F 
GAPEQQPSALEHSIQYSGEVRRFNTANDDNVTQVRAFYVNVLNE 
EQRKRLCENIAGHLKDAQI FIQKKAVKNFTEVHPDYGSHIQALL 
DKYNAEKPKNAIHTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
NLGKKISVPRDVMLEELSLLTNRGSKMFKLRQMRVEKFIYENHP 
DVFSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSQAGG 
SGSAGQYGSDQQHHLiGSGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDQAGGEGKH I TVFKT Y I S PWERAMGVDPQQKMELG I DLLA 
YGAKAELPKYXS FNRTAMP YGGYEKASKRMTFQMPKV 


6587 


75 


1117 


RRVPSLGKMPECWDGEHDIETPYGLLHWIRGSPKGNRPAILTY 
HD VGLNHKLCFNT F FNFEDMQE I TKH FWCHVDAPGQQ VGAS Q F 
POGYQFPSMEQLAAMLPSVVQHFGFKYVIGIGVGAGAYVLAKFA 
L I FPDLVEGLVLVNIDPNGKGWI DWAATKLSGLTSTLPDTVLSH 
L F S QE E LVNNTE LVQ S YRQQ I GNWNQ ANLQLFWNM YNS RRDLD 
INRPGTVPNAKTLRC PVMLWGDNAPAEDGWECNS KLDPTTTT 
FLKMADSGGLPQVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTASLTS AS S VDGSR PQACTHS ESSEGLGQVNHTMEVS C 


6588 


137 


501 


LGLQAQLLELRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQ KALS KS KKAQE VE VLLS E NEMLQAKLHSQ EEDFRLQNS TLMA 
EFSKLCSQMEQLEQENQQLKEGAAGAGVAQAGP 


6589 


2 


1405 


RPWGSAMATFSRQEFFQQLLQGCLLPTAQ^GLDQIWLLLAICLA 
CRLLWRLGLPSYLKHASTVAGGFFSLYHFFQLHMVWVVLLSLLC 
YLVLFLCRHS SHRGVFLSVTI LI YLLMGEMHMVDTVTWHKMRGA 
QM I VAMKAVSLGFDLDRGEVGTVPSP VEFMG YLYFVGTI VFGP W 
ISFHSYIiQAVOGRPLSCRWLQKVARSLALALLCLVLSTC^PYL 
FP Y FI PLNGDRLLRNKKRKARGTMVRWLRAYES AVS FHFSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW 
TS WNLPMS YWLNNYVFKNALRLGTFSAVLVT YAAS ALLHGFS FH 
LAAVLLS LAF ITYVEHVLRKRLAR ILS ACVLS KRCP PDCSHQHR 
LG LGVRALNL LFGALA I FHLAYLG S L FDVDVDDTTEEQG YGMA Y 
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


2177 


656 


VRAYEHVLS LLENVFTPMFCHRDE YFRQLLRGAES PTRNS KLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSSERKEKKER I PVFCIDVERNDRRAVGHEPEHWS VYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQ E YLQKLLQH P ELSNS QLLAD FL S PNGGETQ FLDK I LP D VNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
S PT S ENNKKLFNDL FKNNANRAENT ER KQNQNY FME VMTVEG VY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYI PDLLVKCIGEETKYES I RLLFDGLQQP VLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6591 


2177 


656 


VRAYEHVLS LLENVFTPMFCHRDE Y FRQLLRGAES PTRNS KLNR 
G S LS L DDFRNTQKRGE S FG I S R I GS KI KG VF KS TTMEG AML PN Y 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY 
VDF FED PS S ERKEKKERI PVFCIDVERNDRRAVGHEPEHWS VYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQ B YLQ KLLQHPELSNS QLLAD FLS PNGGE TQFLDK I L PD VNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLKGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNY I PDLLVKCIGEETKYES I RLLFDGLQQP VLNKQLTYVLLD I 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG 
DLPQVERFFK I FPLLGLHEEGLRKFSEYLCKQ VAS KAEBNLLMV 
LGTDMS DRRAAV I FADTLTLLFEG I AR IVETHQP I VETYYGPGR 
LYTL I KYLQVECDRQVEKWDKF I KQRD YHQQFRHVQNNLMRNS 
TTEK IEPRELDPI LTEVTLMNARS E L YLRFLKKR I S S DF E VGDS 
MASEEVKQEHQKCLDKLLNNCLLS CTMQELIGLYVTMEEYFMRE 
TVNKAVALDTYEKGQLTSSMVDDVFYI VKKCIGRALS SSS IDCL 
CAMINIiATTELESDFRDVLCNKLRMGFPATTFQDIQRGVTSAVN 
IMHS S LQQGKFDTKGIESTDEAKMS FLVTLNNVEVCS EN I STLK 
KTLES DCTKLFS QG IGGEQAQAKFDS CLSDLAAVSNKFRDLLQE 
GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAE FKASLSP VI YDSLTGLMTS LVAVELEKWLKS 
TFNRLGGLQFDKELRSL IAYLTTVTTWT I RDKFARLSQMATI LN 
LERVTE I LDYWGPNS GPLTWRLTPAEVRQVLALRIDFRSED I KR 
LRL 


6593 


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR 
RGGEGGGGRGRGDKRRRRQARRQRRRPEPAEARGGKMADVLSVL 
RQYN I QKKE I WKGDEVI FGE FS WPKNVKTNYWWGTGKEGQPR 
EYYTLDS ILFLLNNVHLSHPVYVRRAATENI P WRRPDRKDLLG 
YLNGEASTSASIDRSAPLEIGLQRSTQVKRAADEVLAEAKKPRI 
EDEECVRLDKERLAARLEGHKEG I VQTEQ I RSLSEAMS VEKI AA 
I KAKIMAKKRSTI KTDLDDDITALKQRS FVDAEVDVTRDIVSRE 
RVWRTRTTILQSTGKNFSKNIFAILCjSVKAREEGRAPEQRPAPN 
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKSVTEGASARKTQTPAAQPVPRPVSQARPPPNQKKGSRTP 
I IIIPAATTSLITMLNAKDLIiQDLKFVPSDEKKKOGCQRENETIj 
I QRRKDQMQ PGGTAI S VT VP YR WDQ P L KLM PQDWDR WAVFVQ 
GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
DILSTIGYDNIIQHLNNGRKNCKEFEDFLKERAAIBERYGKDLL 
NLSRKKPCGQSEINTLKRALEVFKQQVDNVAQCHIQLAQSLREE 
ARKMEEFREKQKIiQRKKTEL I MDAIHKQKSLQFKKTMDAKKNYE 
Q KCRD KDEAEQ AVSRS ANLVN P KQQ EKLFVKLATS KTAVEDSDK 
AYMLHIGTLDKVREEWQSEHIKACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCS IQRDI EYFVNQRKTGQI 
PPAPIMYENFYSSQKNAVPAGKATGPNLARRGPLPIPKSSPDDP 
NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 
R YNA YP S EQE KLSLS GQ TNLS VLQ I CNWFINARRRLLPDMLRKD 
GKDPNQFTI SRRGGKASDVALPRGS S PSVLAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAP F PRGELES PKPLVTPG STLTLLTRAEA 
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK 
QQDPSLPLLHTPIPLVSENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKI FC I R I SDD I DDP KWTLCLQVMLPNEY PGTAP 
P I YQLNAPWLKGQERADLSNSLEE IYIQN IGES I LYLWVEKI RD 
VL I Q KSQMTE PGPDVKKKTEEED VE C E DDL I LACQPES S VKALD 
FD3 SETRTEVEVEELPP IDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHN I YAYR I YCEDKQTFLQDCEDDGETA 
AGGRLLHLME I LNVKNVM VWSRWYGG I LLGPDRFKHINNCARN 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKI FCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VL I Q KS QMTEPGPD VKKKTE E EDVE CE DDL ILACQPESS VKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNIYAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHLMEILNVKNVMVVVSRWYGGI LLGPDRFKHINNCARN 
I LVE KN YTNS PEES S KALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE 
LRKKGNLEWLDKSKSSFLIMVmRPEEWGKLIYQWVSRSGONNSV 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS 
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQC P PKTGS VTP PD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKY I DENQDRY I KKLAKWVAI QS VSAWPEKRGE I RR 
MMEVAAADVKQLGGS VELVD I GKQKLPDGS E I PLPP ILLGRLGS 
DPQKKTVCIY3HLDVQPAALEDGWDSEPFTLVERDGKLHGRGST 
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
L I FARKDTF FKD VD YVC I SDNYWLGKKKP C I TYGLRG I C YFF I E 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN 
EAVAAVTEEEHKLYDDI DFD I EE FAKDVGAQ I LLHSHKKD ILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 
WGEQVTSYbTKKFAELRSPNEFKVYMGHGGKPWVSDFSHPHYL 
AGRRAMKTVFGVEPDLTREGGSIPVTLTFQEATGKNVMLLPVGS 
ADDGAHS QNEKLNRYN Y I EGTKMLAAYL YE VSQLKD 


6600 


2 


934 


PG RL FR VAAME S AGL EQLLRELLL PDTERI RRATEQ LQ I VLRAP 
AALSALCDLLASAADPQIRQFAAVLTRRRLNTRWRRLAAEQRES 
LKS L I LTALQRETEHCVS LSLAQLS AT I FRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYLSTBDVPLARMLVPKLIMAMQ 
TLIPIDEAKACEALEALDELLESEVPVITPYLSEVLTFCLEYAR 
NVALGNAI RI R I LCCLT FLVKVKS KALLKNRLLATLAAHP FPHC 
GC 


6601 


529 


1420 


P RAAARAP P P AVLRRDRRAATAPGAGE MTLHG P LAQR Y FLNH I E 
KI TTW QD P RKAMNQ P LNHMNLH P AVS S TPVPQRS MAVSQ PNLVM 
NHQHQQQ MAP S TLSQQNH P TQN P P AGLMSMPNAL TTQQQQQQKL 
RLQR I QMERER I RMRQE ELMRQEAALCRQLPMEAETLAP VQAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGQTPMNINPQQTRFPDFLDCLPGT 
NVDLGTLES EDLI PLFNDVE SALNKSE PFLTWL 


6602 


127 


617 


LIiDFPALPKFVLAQSPKAGKPSTMTSMTQSLREVI KAMTKARNF 
ERVLGKITLVSAAPGKVICEMKVEEEHTNAIGTLHGGLTATLVD 
NI STMALLCTERGAPGVSVDMN IT YMS PAKLGED I VITAHVLKQ 
GKTLAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603 


79 


660 


PVG P S S LAARTGLGHLP FLHRLASSRGLDMDLLQFLAFLF VLLL 
SGMGATGTLRTSLDPSLEIYKKMFBVKRREQLLALKNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFS HWENTAF FGD WLR F P RI VH YY FDHNSNWNLL I R WG I S FC 
NQTGVFNQGPHSPILSLM 


6604 


3 


686 


TSTAQRQGGERMSFRGGGRGGFNRGGGGGGFNRGGSSNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFL. 
HP CE DD I VC KCTTDENKVP Y FNAP VYLENKEQ IG KVDE I FGQLR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKLLPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


648 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDLDDVLEKAKKANW 
ALVAVAEHSGEFEKIMQLSERYNGFVLPCLGVHPVQGLPPEDQR 
SVTLKDLDVALP I IENYKDRLLAIGEVGLDFSPRFAGTGEQKEE 
QRQ VL I RQ IQLAKRLNL P VNVHS RSAGRPTINL LQEQG AEKVL L 
HAFDGRPS VAMEGVRAGYFFS I PPSII RSGQQKLVKQLPLTSI C 
LETDS PALGPEKQVRNEPWNI S I SAE YI AQVKG I S VEE VI EVTT 
QNALKLFP KLRHLLQK 


6606 


2 


1682 


FVE I R PRAE VANLSAHSAS P I QDAVLKRLS LLED I VYRQLNGLS 
KS LGL I E G YGGRG KGGLPATLS P AEE E KAKG PHB KYGYNS YLS E 
KI SLDRS I PDYRPTKCKELKYS KDLPQ IS 1 1 FI FVNEALSVI LR 
S VHS AVNHTPTHLLKE 1 1 LVDDNSDEEELKVPLEE YVHKRYPGL 
VKWRNQ KREGL I RAR I EG W KVATGQ VTGF F DAHVE FTAGW AE P 
VLSR I QENRKRVI LPS IDN I KQDNFEVQRYENSAHGYS WELWCM 
Y I S P P KDW W D AGD P S L P I RTP AM I GCS FWNRKF FGE I G LLDP G 
MD VY GGEN I ELG I KVW LCGG S ME VLPCSRVAHI ERKKKP YNSN I 
G FYT KRNALRVAE VWMDD YKSHVY I AWNL PLENPG I D I GDVS ER 
RALRKS L KCKNFQW YLDHVY P EMRR YKNTVAYGE LRNN KAKDVC 
LDQG P L ENHTAI L Y P CHGWG PQLAR YTKEG FliHLGALGTTTLL P 
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAGIDL I LRSCTGQRWTIKNS I K 


6607 


137 


986 


VPACAGIiKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GI S FQGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 






WO 01/53312 



PCT/US00/34263



SGWNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFE 
DFVTALS I LLRGTVHE KLRWTFNL YD I NKDG YINQEEMMD I VXA 
IYDMMGKYTYPVLKEDTPRQHVDVFFQKMDKNKDGIVTLDEFLE 
SCQEDDN I MRS LQLFQNVM 


6608 


224 


1140 


RPCFSSPTGLCPRLSYPMILLQHAVLPPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEQDCALEELCKPLY 
CKLCNVTLNS AQQAQAHYQGKNHGKKLRNYYAANS CPP PARMSN 
WEPAATPWPVPPQMGS FKPGGRV I LATENDYCKLCDASFSSP 
AVAQAHYQGKNHAKRLRLAEAQSNS FSESSELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
SMCNVGAGEEME FRQHLESKQHKS KVSEQRYRNEMENLGYV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTULSWSAVT 
PAAE PGNFQLS PAEPRGPLAS PVRAAPRAPCPAAEMSELNTKTS 
PATNQ AAGQE E KGKAGNVKKAE E E E E I D IDLTAP ETE KAALAI Q 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKS LCNLH I F I RFPLTYPDMYMGMMCTAKKCG IRFQP PAIILI 
YES E I KGK I RQ R I MP VRN FS KFSDCTRAAEQL KNN PRH KS YLE Q 
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKL 
DDKELAKRKS IMDELFEKNQKKKDDPNFVYDIEVEFPQDDQLQS 
CGWDTESADEF 


6611 


978 


212 


PGCSGAGSRVWWLPALRHLAMGSTESSEGRRVS FGVDEEERVRV 
LQG VRLSENWNRMKE PS S P P PAP TSSTFGLQDGNLRAPHKES T 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRLARELESREAELRRRDTFYK. 
EQLERIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQ I LH CYRDRPHEVLL CS DL VKA YQRCVS AAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGS S DGDQRESVQQEPEREQVQP KKKEGKI 
SS KTAAKLSTS AKR I QKE LAE I TLD P P PNCS AGPKGDNI YE WRS 
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVI CLD ILKDNWS PALTI S KVLLS I CSLLTDCN PADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVEDTYRQVISCDKS ICTLQ I TDTTGSHQFPAMQRLS IS KGHA 
F I L V YS I TSRQS LEELKP IYEQICEI KGDVE S I P I ML VGN KCD E 
S P SRE VQS S EAEAIARTW KCAFMETSAKLNHNVKELFQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM 


6614 


3 


1191 


S S AAE AMRVLVRRCWGPPLAHGARRGRPS PQWRALARLGWEDCR 
DSRWEKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 
WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF 
GRLLNEAL I LKFP YG I LNVHPS CLPRWRG PAP VI HTVIjHGDTVT 
GVTIMQ I RPKRFDVG P I LKQETVP VPPKSTAKELEAVLSRLG AN 

EQI FRLYRAIGNI I PLQTLWMANTIKLLDLVEVNSSVLADPKLT 
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NG YLH P WYQ KNSQ AQ P S QCRFQTLRLPTKKKQ KKTVAMQQC I E 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFliREHPSLRliQTDARKVRCILTGHE 
L P CRL P ELQVYTRG KK YQRL VRAS P AFD YAE F E PH I VPS TKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRRBDQMDGDGPRPREAFWEPTSSDEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETT VYRGLVQ KRG KKQLGSLKKK PKS HHRKP KS FS SCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS 
PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPIWLQPSPPPQSSP 
PPQPHPCHTCRGLVDSFNKGLERTIRDNFGGGNTAWEEENLSKY 
KDS ETRLVE VLEG VCS KS D FE CHRLLE LS E ELVE S WW FHKQQEA 
PDLFQ WLCSDS L KL C C P AGT FG P S CLP CPGGTER P CGG YGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF 
G PCARCSG P EESNCLQCKKG WALHHLKCVD I DECGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGENKQCENTEGGYRCICAEGYKQMEGICVKEQ 
I PE SAGFFS EMTEDELWLQQMFFGI 1 1 CAIATLAAKGDLVFTA 
I FIGAVAAMTGYWLSERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVSLLELEDRLQCP I CLEVFKESLMLQCGHS YCKGCLVS 
L S YHLDTKVRCPMCWQAVDGS S SLPNVS LAWVIEALRLPGDPE P 
KVCVHHRNPLSLFCEKDQELICGLCGLbGSHQHHPVTPISTVCS 
RMKEELAALFSELKQEQKKVDE L I AKLVKNRTRI DGSAPSLC PC 
LGPATFTFL 


6618


548 


136 


DGKVARRAPNSPAFQNDlYPLVSAPRATTAESPWSKVLQNTQCR 
NVPKMTSERSRIPCLSAAAAEGTGKKQQEGRAMATLDRKVPSPE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PAS S E VLT AAVM FLLLNCI VAVS QNMG I G KNG DL PR P P LRJNE FR 
YFQRMTTTSSVEGKQNLVIMGRKTWFS I PEKNRPLKDRINLVLS 
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG 
I LSDVQEGKHI KYKFEVCEKDD 


6620 


3 


1879 


NS RVDDF VARARKAAENEASQE S ALGAYS P VDYMS I T S FPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGS Q DG S P LRE TRKD P FS AAAAE CS CRQDGLTVI VTACLTFATG 
VT VALVMQ I Y FGDPQ I FQQGAWTDAAR CTS LG I E VLS KQGS S V 
DAAVAAALCLG I VAPHS SGLGGGGVMLVHD I RRNESHL I DFRES 
APGALREETLQRSWETKPGLLVGVPGMVKGIiHEAHQLYGRLPWS 
QVIJVFAAAVAQIX3FNVTHDLARALAEQLP PNMS ERFRETFLPSG 
R P PLPGS LLHRPDLAEVLDVLGTSGPAAF YAGGNLTLEMVAEAQ 
HAGG V I TE EDFSNYSALVEKP VCGVYRGHLVL3 PPPPHTGPAL I 

S AIjN I LEG fnltslvsreqalhwvaetlk i alalas rlgd p vyd 

S T I TESMDDMIiSKVEAAYLRGH INDSQAAPAPLLPVYELDGAPT 
AAQVLIMGPDDFIVAMVSSLNQPFGSGLITPSGILLNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSRE PS CGLDCRCLS YLWLV 
SIPHAANMG 


6621 


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYELAKLLPLP 
AAITSQLDKASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNT 
SVKVIGAQRRRSPSALAIEVFEAHLGSHILQSLEK3YVFALNQEG 
KFLY I SETVS I YLGLSQVELTGSS VFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


319 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSCI IKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN 
IFDMAGHPFFYEVRKPF 


6623 


1886 


189 


KAL FEKVKKFRLHVE EGD ILYAMYVRQTVLKVI KFL 1 1 1 AYNS A 
LVSKVQFTVDCNVDIQDMTGYKNFSCNHTf4AHLFSKLSFCYLCF 
VS I YGLTCLYTLYWLFYRSLREYSFEYVRQETGFDDI PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNKEWTPDKL 
RQKLQTNAHNRLELPLIMLSGLPDTVFEITEIiQSLKLE 1 1 KNVM 
I P AT I AQLDNLQELS LHQCSVKIHSAALS FLKENLKVLS VKFDD 
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTIjESLRDLKSLKI 
LS I K^^SKIPQAWDVSSHLQKMCIHNTCTKLVMLNNLKKMTN 
LTELELVKCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLSYNDIRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLS PKIGNLLFLS YLDGKGNHFE I 
L P PE LGD CRALKRAGL WED AL FETL P S D VREQMKTE 


6624 


218 


1786 


GSRRGGGSRIPAVSTHVAPGRSVLRPFASGALRLRSLVKALGGC 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAI PAMWPNATLLE KLLEKYMDEDGEWWI A 
KQRGKRAI TDNDMQS I LDLHNKLRSQVYPTASNME YMTWDVELE 
RSAESWAESCLWEHGPASLLPSIGQNLGAHWGRYRPPTFHVQSW 
YDEVFCDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACP ? S FGGGCRENLCYKEGSDRYYPPRE EETNE IERQQS QVHDT 
HVRTRSDDS SRNEVI S AQQMSQ I VS CEVRLRDQCKGTTCNRYEC 
PAGCLDS KAKVIGS VHYEMQSS I CRAAIHYG 1 1 DNDGGWVD ITR 
QGRKHYF I KSNRNG I QTIGKYQSANS FTVS KVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625 


1124 


543 


PG PRGGGGSLLSTKALGRSRGLGMHPGPS SGGTEGGVPTALRPP 
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW 
SFYGQGRKIAEVCCTSIVYATEKKQTKVBFPEARIFEETLNILI 
YETPRGPDPALLEATGGAAGAGGAGRGEDE ENREHRVRR IHVRR 
H I THDERPHGQQI VFKD 


6626 


3 


1498 


SAVEFVYTDRFHLILGISVEFIiCSLRSDATMESITACLHALiQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQI I CAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKS L VFATLELCVC I LVRQLPELNPKLTGS PGVKATKPQ I LLED 
GS RLVSAAL VI L S EL PA VCS PEGS ISILPTIL YL PIGVLRE TAV 
KLPGGQLS S TVAASLQ ALKG I LS S PMARAE KS RTAWTD LLRS AL 
TT I LDCWDPVDETHQELDE VSLLTAI TVF I LS TS PEVTT I P CLQ 
KRC I DKFKATLE I KDPWQ I KTYQLLHS I FQ YPNPAVS YP YI YS 
LAS C I MEKLQE I DKRKP ENTAELE I FQEG I KVLETL VT VAE EHH 
RAQLVACLLP ILI S FLLDENSLG SATS I MRNLHD FALQNLMQ IG 
PQYSSVFKSLVASSPALKARLEAAIKGNQESVKVKIPTSKYTKS 
PGKNSSIQLKTSFL 


6627 


1 


697 


G I PHLSSRDMTGTPGAVATRDGEAPERSPPCS PS YDLTGKVMLL 
G DTG VG KTCFL I Q FKDGAFLSGT F I AT VG I D FRNKWT VDG VRV 
KLQ I WDTAGOE RFRS VTHAYYRDAQALLLXYD I TNKSS FDN I RA 
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETS AKTGMNVELAFLAI AKELKYRAGHQADE PSFQ I RD YVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGS KAN 
KE FGDS LS LE I LQ I IKESQQQHGLRHGDFQRYRG YCSRRQRRLR 
KTIjNFKMGNRHKFTGKKVTEELIjTDNRYIiljLVLMDAERAWSYAM 
QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIYEKLASAFTE 
EQAVL YNQR VE E I S PN I R Y CAYN IGDQS AI NE LMQMRLRS GGTE 
GLLAEKLEALI TQTRAKQAATMSEVEWRGRTVPVKIDKVRI FLL 
GLADNEAAIVQAE S EETKERLFESMLSECRDA I Q WREELKPDQ 
KQRD Y I LEGE PG KVSNLQ YLHS YLTY IKLSTAI KRNENMAKGLQ 
RALLQQQPEDDSKRSPRPQDLIRLYDIILONLVELLQLPGLEED 
KAFQKEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAF ICNS LKDL PDVQELI TQVRSE KCS LQAAAI LDA 
NDAHQTETSSS QVKDNKPLVERFETFCLDPSLVT KQ ANLVHF P P 
GFQP I PCKPLFFDLALNHVAFPPLEDKLEQKTKSGLTGYIKG IF 
GFRS 


6629 


5653 


4549 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFPETSEPVW I LG 
RKYS I FT E KDE I LS D VAS RL W FTYRKN FP AI GGTG P TS DTG WG C 
MLRCGQMIFAQALVCRHLGRDWRWTQRKRQPDSYFSVLNAFIDR 
KDS YYS IHQ I AQMGVGEGKS IGQWYG PNTVAQVLKKLAVFDTWS 
SLAVHIAMDNTWMEEIRRLCRTSVPCAGATAFPADSDRHCNGF 
PAGAEVTNRPS PWRPLVLL I PLRLGLTDIKEAYVETLKHCFMMP 
QSLGVIGGKPNSAHYFIGYVGEELIYLDPHTTQPAVEPTDGCFI 
PDES FHCQHPPCRMSIAELDPS I AWRGGHLSTQAPGAECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SESMKPKF 


6631 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI
SESMKPKF 


6632 


1273 


588 


WNS RGRTQRGAAPLAP AAAMKAWQRVTRAS VTVGGEQ ISAIGR 
G I CVLliG I S LEDTQKELEHMVRKI LNLRVFEDESGKHWS KS VMD 
KQYEILCVSQFTLQCVLKGNKPDFHIAMPTEQAEGFYNSFLEQL 
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633 


1145 


617 


ATGRHEGVPTLEGIIQQLVNGIITPATIPSLGPWGVLHSNPMDY 
AWG ANGLD AI I TQLLNQ F ENTGP P P AD KEK I Q ALPTVP VTE EHV 
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGGIPRKGSGPRRRLPMARLRDCLPRLMLTLRSLLFWSLVYCYC 
GLCAS I HLLKLLWSLGKG PAQTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPLMIiLLHGFPEFWYSWRYQLREF 
KS E YR WALDLRG YGETDAP I HRQN Y KLDCLI TD I KD ILDS LG Y 
SKCVLIGHDWGGMIAWLIAICYPEMVMKLIVINFPHPNVFTEYI 
LRHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPLKH 
HMVTTPTLLLWGENDAFMEVEMAEVTRFYVKNYFRLTILSEASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGLG PHGPSFARVP VAPS SS SG 
GRGGAEPRPLPLSYRLLDGEAALPAWFLHGLFGSKTNFNSIAK 
I LAQQTGRRVLT VDARNHGDS PHS P DMSYEIMSQDLQDLLPQLG 
LVP CVWGHSMGGKTAMLLALQRPELVERL IAVD IS P VESTGVS 
HFATYVAAMRAINIADELPRSRARKLADEQLSSVIQDMAVRQHL 
LTNLVEVDGRFVWRVNLDALTQHLDKILAFPQRQESYLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFliV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDGTCVLDKAGSYKCACLAGYTGQRCENLLEAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRHAKIGT 
WSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKLQSAPTKKPALPFG 
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 
CI P ICGKI ENITAPKTQGLRWPWQAAI YRRTSGVHDGSLHKGAW 
FLVCSGALVNERTW^AAHCVTOLGKVTMIKTADLKVVLGKFYR 
DDDRDEKT IQSLQI SAI I LHPNYDP I LLDADIAI LKLLDKAR I S 
TRVQ P I CLAASRDLSTSFQESH I TVAG WNVLADVRSPGFKNDTL 
RSGWS WDSLLCEEQHEDHG I PVSVTDNMFCAS WEPTAPSDIC 
TAETGG I AAVS FPGRAS PEPRWHLMGLVS WS YDKTCSHRLSTAF 
TKVLPFKDWIERNMK 


6638 


1391 


224 


GG I PQAGGKMAAPW WRAALCECRRWRG FSTS AVLGRRT P PLGPM 
PNSD IDLSNLERLE KYRS FDRYRRRAEQEAQAPHW WRT YRE YFG 
EKTDP KEKI DIGLPPP KVSRTQQLLERKQAI QELRANVEEERAA 
RLRTASVPLDAVRAEWERTCGPYHKQRLAEYYGLYRDLFHGATF 
VP R VPIjHVAYAVGE DDIiM P VYCGNEVT PTEAAQ AP EVT YEAEEG 
S L WTLLLTS LDGHLLE PDAE YLHWLL TNI PGNRVAEGQ VTCP YL 
PPFPARGSGIHRIiAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
TFDFYKKHQETMTPAGLSFFQCRWDDSVTY I FHQLLDMRE PVFE 
FVRP P P YHP KQ KRF PHRQP LR YLDR YRDSHE PT YG I Y 


6639 


2046 


1268 


I GC F I MDGGDDGNL 1 1 KKRFVSEAELDERRKRRQEEWEKVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREEELKELKEYRNNLKKVGISQE 
NKKE VEKKLTVKP I ETKNKFSQAKLLAGAVKHKS S ESGNS VKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGILPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRIVLVGKTGSGKSATANTILGEEIFDS 
RIAAQAVTKNCQKASREWOGRDLLVVDTPGLFDTKESLDTTCKE 
ISRCIISSCPGPHAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 
KHMVIbFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQELVELIEKMVQCNEGAYFSDDIYKDTEER 
LKQREEVLRKI YTDQLNEEI KojVEEDKHKSEEKKEKEI KLLKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTD'SAPLAGLAWSS 
AS APPPRGFSAI S CTVEGAPAS FGKS FAQKSG Y FLCLSSLGS LE 
NPQENWAD I Q I WDKS PLPLGFS PVCDPMDS KAS VSKKKRMCV 
KLLPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRLGSRAS 
TLRRNDSIYEASSLYGISAMDGVPFTLHPRFEGKSCSPLAFSAF 
GDLTIKSLADIEEEYNYGFWEKTAAARLPPSVS 


6642 


22 


1296 


PLEERMMTXMDPNDQAQRDI I FELRRIAFDAESDPSNAPGSGTE 
KRKAM YTKD YKMLGFTNH IN P AMD FTQT PPGM LALDNML YLAKV 
HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 
NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 
KVMQWREQ I TRALPS KPNS LDQFKS KLRSLS YSE I LRLRQS ER 
MSQDDFQSP P I VELREKIQPE ILELI KQQRLNRLCEGSSFRKIG 
NRRRQERFW YCRLAIiNHKVLHYGDLDDNPCXSEVTFESLQEKI PV 
ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFSIL.YDPDETLNF 
IAPNKYE YC I W I DGLSALLGKDMSS ELTKSDLDTLLSMEMKLRL 
LDLEN IQI PEAPPPI PKE P S S YDFVYHYG 


6643 


3049 


2265 


SLHAP AEGRTRGRLAE KP KMLTRKI KLWD I NAH I T CRLCS G YL I 
DATT VTE CLHTFCRSCLVKYLEENNTCPTCR I VI HQSHPLQY I G 
HDRTMQDIVYKLVPGIjQEAEMRKQREFYHKLGMEVPGDIKGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLECNSSKLRGLK^KWIRCSAQATVLHLKKFIAKKLNLSSFNEL 
DILCNEEIU5KDHTLKFVVVTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMLYILDQR 
LRAQNI PGDKARKVLNDII STMFNRKFMEELFKPQELYSKKALR 
TVYERIiAHAS I MKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL 
LVTFNHLDT I KGFIRDS PT I LQQVDETLRQLTE I YGGLS AGEFQ 
LIRQTLLIFFQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGL I RMFNNKGEE VKR I E FKHGGNYVPAPKEGS FEFYGDR VL 
KLGTNMYSVNQPVETHVSGSS KNLASWTQES IAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EV IN I QATQDQQRSEELAR IMGEFE I TEQPRLS TSKGDDLLAMM 
DEL 


6645 


6530 


4646 


FVEGLAGYVYKAASEGKVLTIAALLLNRSESDIRYLLGYVSQOG 
GQRSTPLI IAARNGHAKVVRLLLEHYRVQTQQTGTVRFDGYVID 
G ATALWCAAG AGHFE WKLL VS HGANVNHTTVTNS TP LRAACFD 
GRLDIVKYLVENNANIS IANKYDNTCLM I AAY KGHTDWRYLLE 
QRADPNAKAHCGATALHFAAEAGHIDI VKELI KWRAAIWNGHG 
MTPLKVAAES CKADWELLLSHADCDRRSRI EALELLGAS FAND 
RENYD 1 1 KT YHYLYLAMLERFQDGDNILEKEVIiPP IHAYGNRTE 
CRNPQELESIRQDRDALHMEGLIVRERILGADNIDVSHPIIYRG 
AVYADNMEFEQCIKLWLHALHLRQKGNRNTHKDLLRFAQVFSQM 
IHLNETVKAPDIECVLRCSVLEIEQSMNRVKNISDADVHNAMDN 
YECNLYTFLYLVCI STKTQCS EEDQCKINKQ I YNL IHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA 
VDNEGNSALHI I VQYNRP I SDFLTLHS I I I S LVE AGAHTDMTNK 
QNKTP LDKS TTGVS E I LLKTQMKMS LKCLAARAVRAND IN YQDQ 


6646 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL
TTAVTSAFLIAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6648 


413 


897 


RNCWNCFTK Y FNS PPED I DHKDS YL I TRS I MAE PD Y I EDDNPEL 
I R PQKL INP VKTSRNHQDLHRE LLMNQKRGLA PQNKPBLQKVME 
KRKRDQVIKQKEEEAQKKKSDLEIELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


W I PRAAG IRHE VKWDVKE I MSQHNI YVDALLKE FEQFNRRLNEV 
S KRVR I PLP VSNI LWEHC I RLANRTI VEGYANVXKCSNEGRALM 
QLDFQQFLMKLEKLTDIRP I PDKEFVETYIKAYYLrENDMERWI 
KEHRE YSTKQLTNLVNVCLGSHINKKARQKLLAAIDD IDR PKR 


6650 


32 


765 


LVPLVFS LL VQ S CKQVYRS I AMKFVP C LiLLVTL S CLGTLGQAP R 
QKQGS TGEEFHFQTGGRDS CTMRPSS LGGGAGEVWLRVDCRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LRPS VCREAG PQAHMQQ VTS S LKGS P EPNQQ P EAGTPSLRP KAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLI S FFRG 


6651 


3425 


1353 


AKELLKVGDFSLCAGP YQNTADTMENLS KEPLAS FVSESFDI SA 
CGIATEHVK I DNSGEGLTAEAGSETLSRDGEVGVNSDMHYE LSG 
DSDLDLLGDCRNPRLDLEDSYTLRGSYTRKKDVPTDGYESSLNF 
HNNNQ ED WG CS S WV PGME TS LPPGHWTAAVKKE E KCVP P YVQ I R 
DLHG I LRTYAN FS I TKE L KDTMRTSHGLRRH P S FS ANCGL P S S W 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 
RPQGQPRRGYTASSIjDSSSSWRERCSHNRDLRNSQRNHTVSFHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDII IDVCTNLHVKLRSVVKEA 
CKSTFLF YLVETEDKS FFVRTKNLLRKGGHTE I E PQHFCQAFHR 
ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHTYQE L FRAGGFVI SDDK I LEAVTLVQLKE 1 1 KI LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
YHOCDSRSSTKAE ILKCLLNLQIQHIDARFAVLLTDKPTI PREV 
FENNG I L VTD VNNF I EN I E K I AAP FR S S YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAW I RSRRAS EWVGKME VPRLDHALNS PTSPC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDS FKISWEMDS KSKDRITHYFIDLNKKENKNSNKFKHKDVPT 
KLVAKAVPLPMTVRGHWFLSPRTEYTVAVQTASKQVDGDYWSE 
WS E 1 1 E FCTAD YS KVHLTQLLE KAE V I AGRML KFS V F YRNQHKE 
YFDYVREHHGNAMQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPQDSPYGRYRFEIAAEKLFNPNTNLYFGDFYCMYTAYHYV 
I L VI AP VGS PGDE F C KQRL PQLNS KDNKFLT CTEE DG VL VYHHA 
QDVI LEVI YTDPVDLS LGTVAE ITX5HQLMS LSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1910 


FFLE PRLR P F PASRARFVPARTR PS PLH PCC FCFEGGG S MLS PQ 
RVAAAASRGADDAMES S KPG P VQWLVQKDQH S FELDEKALAS I 
LLQDHIRDLDWWSVAGAFRKGKSFILDFMLRYLYSQKESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDS QS TVKDCAT I FALS TMTS S VQI YNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLMPLVRDWSFPYEYSYGL 
QGGMAFLDKRLQVKEHQHEE IQNVRNHIHS CFSDVTCFLLPHPG 
LQVATSPDFDGKLKD IAGEFKEQLQALI PYVLNPSKLMEKEING 
SKVTCRGLLEYFKAYIKIYQGEDLPHPKSMLQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQLALDHFKKT 
KKMGGKDFS FRYQQE LEE E I KELYENFCKHNGS KNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAQLFNCMVGLLLIALLTWGY 
IRYSGQYRE LGGAID FGAAYVLEQAS SHIGNS TQATVRDAWGR 
PSMDKKAQ 


6654


1 


705 


RTSLSPSQCSSFNLAMASAGMQILGVVLTLLGWVNGLVSCALPM 
WKVTAFIG^SIVVAQVVWEGLWMSCVVQSTGQMQCKVYDSLLAL 
PQDLQAARALCV I ALLVALFGLLVYLAGAKCTTCVEE KDS KARL 
VLTSG I VFVI SGVLTL I P VCWTAHAV I RDF YN PLVAEAQ KRELG 
ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655


341 


16 


KD A YM F KKGL LALAL V F S L P VFAAEHW I D VR VP E Q YQQEHVQG A 
INI P LKE VKER I ATAVPDKNDTVKVY CNAGRQ SGQAKE I LS EMG 
YTHVENAGGLKDIAMPKVKG 


6656 


2 


1212 


TELPPRPANLAI QP PLS PLRALAPLPEKPGAVP.P PQKRMAKVAK 
DLNPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCS QEDHCLTS DLE DDRXI GRLLMDS KYS TLTAR VKGGDG I RI Y 
KRNRMI MTNP I ATGKDPTFDTITYEWAP PGVTQKLGLQ YMEL I P 
KEKQPVTGTEGAFYRRRQLMHQLPIYDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGEVALPGQGGLPKEEGKQQEKPEGAETTAA 
TTNGSLSDPSKEVEYVCELCKGAAPPDSPWYSDRAGYNKQWHP 
TCFVCAKCSEPLVDLIYFWKDGAPWCGRHYCESLRPRCSGCDEI 
IFAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAYIVTKGQLLCPT 
CSKSKRS 


6657 


830


2126 


LLTCQERAGDCLLSASTMKEVVYWSPKKVADWLLENAMPEYCEP 
L EH FTGQDL I NLTQEDF KKP P LCRVS S DNGQR L LDM I ETLKMEH 
HLEAHKNGHANGHLNIGVDIPTPDGSFSIKIKPNGMPNGYRKEM 
I KI PMPELERSQYPME WGKTFLAFL YALSCFVLTTVMISWHER 
VPPKEVQPPLPDTFFDHFNRVQWAFS I CEINGMILVGLWLI QWL 
LLKYKS I ISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK 
LFGDWEAQLRR I MKLI AGGGLS I TGS HNMCGDYL YSGHTVMLTL 
TYLF I KEYS PRRLWW YHW I CWLLS WG I FC I LLAHDHYTVDVW 
AYYITTRLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 


6658 


35 


855 


HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM 
FDPVPVXQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QT P EGLSHG I QM EP VDLTVNKRS S P P S AGNS PS S LKF P S SHRRA 
SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 
IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 
MQVPVI ESYEKPI SQKKIKI EPG I EPQRTDYYPEEMS PPLMNS V 
SPPQALLQE 


6659 


18 


523 


E PQ RG D CETW FQNCS L P KF VC F FCWGFWLWRAHSMS NLH SLPG L 
RGLTSISRNQLQCTNAMRVINNYQRRWKNQNTFLLATFANWNV 
CGNPTITCPHNRTLNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQT PANMFY I VACDNRDQRRDPPQ YP WPVHLHT 1 1 


6660 


514 


1707 


CAASLDCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NG FKDQLCS LVFMALTD PS TQLQL VG I RTLT VLGAQ P DLLS YE D 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAEELRVGESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP 
ESCWYFHQTAIPCLIiALAVQASMPEKEPSVLRKVLLEDEVLAAM 
VSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSRF 
QPFQDGSSGQRRLIALLMAFVCSLPRNVSEHIWEVLLFNLDKVT 
PG 


6661 


179 


430 


GVHAASGTLSATWIiAEAKMFDSLAKAGKYLGQAAKLMIGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RS L P KPAP AQ PAS I HCAR FSG VT P PTAKTAMS DGNTAFNALM YC 
GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 


3 


1005 


RPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF 
PKENLFSFQTASTTMQArSNFRKHLRMVGSRRVKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVP PKVMLI S S KVPKAEY I PTI IRRDDPS 1 1 
PILYDHEHATFEDILEEIERKLNVYHKGAKIWKMLIFCQGGPGH 
LYLLKNKVATFAK^KEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLWKSRQHSKLLDFDDVL 


6664 


58 


968 


PRLLRLPRSVWMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNP I SS WFTAMLHCFGGGI LS CLLLAEPPLKF 
LANHTNILLAS S I WY I TFFCPHDLVS QGYS YLP VQLLASGMKEV 
TRTWK I VGGVTHANS YYKNGWI VM I AIGWARGAGGTI I TNFERL 
VKGDWKPEGDEWL KMS YPAKVTLLGS VIFTFQHTQHLAI S KHNL 
MFLYTIFIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
S CE KKS EAKS P SNGVGS LAS KP VD VAS DNVKKKHTKKNE 


6665 


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALC PG I PS P CRMTHQD LS I TAKL I NGGVAGLVG VTC VFP IDLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAI KLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVVVTC 
PME MLK I QLQ DAGRLAVHHQGSAS AP S TS RS YTTGS AS THRRP S 
ATLI AWELLRTQGLAGL YRGLGATLLRDI PFS 1 1 YFPLFANLNN 
LGFNELAGKAS FAHS FVS GCVAGS I AAVAVTPLDVLKTR IQTLK 
KGLGEDMYSG I TDCAR 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WS P G CW PQP I Q KEG VG LW D I R KPQS S LLR YGGNLS LQS AM S VRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQY ILSGSDDFNLYMWRI PADPEAGGIGRWNGAFMVL 
KGHRS I VNQVRFNPHTYM I CSSGVEKI I KIWS PYKQPGCTGDLD 
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
FFDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADNAFHLGPLRVTTrNTVASTPPTPTCED 
AASRQQRLSALRRYQDKRLIALSNESDSEENVCEVELDTDLFPR 
PRSPSPEDESSSSSSSSSSEDEEELNERRASTWQRNAMRRRQKT 
TREDKPSAPIKPTNTYIGEDNYDYPQIKVDDLSSSPTSSPERST 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TS LVTGE ADEGRAGT S H KDN P AP S S S KE ACLN I AMAQRNQDL P P 
EGCSKETFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEW 
AYSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACETPNAGTREDP 
TDT PATDSSRAVHGHSGLKRQR I ELEDTDSENS SSE KKLKT 


6667 


171 


1310 


AEEVERLAAMRSDSLVPGTHTPPIRRRSKFANLGRIFKPWKWRK 
K KS EK FKHTSAALERKI SMRQSREEL I KRG VLKE I YDKDGELS I 
SNEEDSLENGQSLSSSQLSLPALSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQLQYGSHGQHLPSTTGSL 
P^PSGCRMIDELNKTLAMTMQRLESSEQRVPCSTSYHSSGLHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNILPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TLAVATGPALTLRCHVCTS S SNCKHS WCPAS S RFCKTTNTVE P 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKIiH 
NAAPTRTALAHSALSLGLALSLLAVILAPSL 


6669

459 


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRW1LVS 
VCA I S VFQFFSWWNS YNKAI S YIATVPKYRIQATE I AKQQGLLK 
KAKEKGKNKKSKEE I RDEEEN IIKN 1 1 KSKIDI KGGYQKPQ I CD 
LliLFQI ILAPFHLCSYIVWYCRWI YNFNIKGKEYGEEERLYI I R 
KSMKMS KSQFDSLEDHQKETFLKRELWI KENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEG PGRLTFVDD 


6670 


184 


594 


VARI*GEAAKMSSEPPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 
VMQ P P PGMPLP PAD I G P P P YE P PGHPM PQ PGF I PPHMS ADGT YM 
PPGFYPPPGPHPPMGYYPPGPYTPGPYPGPGGHTATVLVPSGAA 
TTVTV 


6671 


1 


763 


LPAEKPRS APNMAGGRCGPQLTALLAAW I AAVAATAGPEEAAL P 
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKNGE I LQI S VGKVDVI QE PGLSGRFFVTTLPAFFHAKDG I FRR 
YRGPG I FEDLQNY I LEKKWQS VE PLTGWKSPASLTMSGMAGLFS 
I SGKI WHLHNYFTVTLGI PAWCS YVFFVI ATLVFGLSMDLVL * V 
ISOCNWDPPYRHVS * /RPSTNLGVHTAHTSEHLRL 


6672 


304


1089 


APGSKPVQFMDFEGKTSFGMSVFNLSNAiMGSGILGLAYAMAHT 
GVIFFLALLLCIALLSSYSIHLLLTCAGIAGIRAYEQLGQRAFG 
PAGKWVATVICLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DW FLKGNLL 1 1 IVSVLI I LPLALMKHLG YLGYTSGLS LTCML F F 
Lvov J. i i\.xs.r [jLAjLiK, i itn i. nxS-V v m-uj v vj i tr^/rt\uo irtrtv 
MFHS* LTGVLTQWP IMAFAFVCHPGGAGPS ITELCRAFQAQD 


6673 


1116 


1963 


LQ I Q THHTKHGAR VTHLG S HQLLANAGTMLCRQQS S S MAP AFS Q 
SVTCGPSPCVRKQESATKCLHIGACGSDLMARGWEQG*G*GLNV 
WLCPCVAFHRGARPQAEEGGARWNSLVSSPWIPPNP*HSSIGAE 
NAVPRP*QG*KVNPSGQERQS\WVLPLPVPGEPLKLPGLPQ*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQHPRFSDTGW 
FGAGHCHSSCDFTRKGAAGGPG 


6674 


1 


440 


LEFDYMCQYDYVEVRDGDNRDGQ I I KRVCGNERPAP IQS IGSSL 
HVL FHSDGS KNFDG FHAI YEE I TAG S S S P CFHDGTCVL DKAGS Y 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT 


6675 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6676 


277 


1678


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6677 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6678 


221 


865 


GPSNQSSGSLSLI VTGC S S Y WS * I NDT CT I LR VLS S NFGRQ * LR 
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGQQWRKGRQRMRN*QS 
LLGSDQESVGLEDLCVFVNFLLHVLLGLFP* PHELFLLPWDLG 
FLFPLLLQGGCHCLVLPANLVSQAPQIGKLSCRLQTHDLEGSRN 
HHPLFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRSPGQNWrVKTVDGWKRFLDEKSGSFVSDL 
SSYCNKEVYNKENLFNSLNYD/SCSQEEKEGHAE*QNQNS\DFH 
QEKWIYVHKGSTKBRHGYCTI^EAFNRLDFSTAILDSRRFNYVV 
RLLEL IAKSQLTS LSGI AQKNFMNI LEKWLKVLEDQQN ITLI R 
ELLQTLYTSLCTLVKRVGKSVTjVGNINMWVYRMETILHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGIjF 


6680 


1498 


2951 


plctlplmpsalpgwagerwekqwpla/pgpgtwqtpvgsisee 
p\rknepdthcprgearpev*hlpkphspgsegaeiqtsa*alp 
/nqvsppqpm*gaeengdqrggkeeageelhrsssgltaapgfp 
evhrnlqtfpglp srgggp /ggagtqgs wapgeq p p / s pllpas 
mqrsqaglpgweaglvespthhipalrpsgtnatgeafpsttcs 
sgp\ pappgptglrpgggsssgghg* * pglpvgkv\galgaaqd 
pqsqgrgptqgtvgtemllsglgsakacpaarpavp*lpsdpas 
tipkkgtrgfgegpgvlqernrwwgraqgftsadaagtappgv 
* lpaplsqpfgatepqvracgmapps pgtsgrlvawgrhpgpqv 
aqgcppgagcwgsqprgsqrcprtythsplghgrapcprrcwh* 
wqdppssprtgclpgiparqaysaprtrsrpgirtgraaygfir 

fqggggg 






WO 01/53312 



PCT/US00/34263 





6681 


1169 


511 


INYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSERE 
ICMTVGVLTQTVGP W SR PGA YLS KQLDGVS KGWPPCPRALAATAL 
LAQEADELTLRQNLNRKSPHA\VVTLINTKGHH*LINARLTRYQ 
TLLCENPHKTI EVSNT/ LNPATLLLVTE S PVKHNCLE VLDS VYS 
SRPNLRDHP * TS VDWELYVDGSGFANPCKVTLKKETS PAPVTPR 
S 


6682 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD 
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE 


6683 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE 


6684 


111 


527 


GL RGG T SRGRAGRE P E F AAG VLC WAGFCQ S P CP PGGRGREAP A 
P P \ SGRRHA* RPA* WLGG PGGDSGGRE EGGS /GELQRAMES KMG 
ELPLDINIQEPRWDQSTFLGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQKVL 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
ECA WLGSLAMGTENNVKS LLDCH 1 1 PALLQGLLS PDLKF I EAC 
LRCLRTIFTSPVTPEELLYTDATVI PHLMALLSRSRYTQEY I CQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VLAFENPQVSMTLVNVLVDGELLPQ I FVKMLQRDKP I EMQLTSA 
KCLTYMCRAGAIRTDDNCIVLKTLPCLVRMCSKERLLEERVEGA 
ETLAYLIEPDVELQRIAS ITDHLIAMLADYFKYPSSVSAITD I K 
RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL 
TASRQGVTST 


6686 


310 


927 


DSVTFDDLAVDFTPKEWTLLDPTQRNLYRDVMLENYKNLATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSGIQMIGSHNGGEVSDVKQCGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF * LQLTLGKS FH * S I HT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDRVHYRS PPLATGE PVDNLS PEERDARTVFCMQLAAR 
IRPRDLEDFFSAVGKVRDVRI I SDRNSRRSKG IAYVEFCE IQS V 
PLAIGLTGQRLLGVP 1 1 VQAS QAE KNRLAAMANNLQKGNGGPMR 
LYVGSLHFNITEDMLRGIFEPFGKV 


6688 


1025 


1 


AEVPNYPRVFHKCPDSCWRFKFQPIQLQPYILLSFSSEKPPISF 
SEPGLPR/SATARMATAAAPPNSSIDLPSDSGMGFISPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICGSKGAGASGSASCSSRAGKTTBATAASSMPSGTSSFSTC 
TMSELEELFSLFSPAPLLS KLFTSSGS I AI CCQDSGPSDTGRLS 
VCQLWLADS DTGKLS DCQEVVTVGDSGGLTCPELSLGRM* MS LL 
SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


S S S AS Y AT SATS I S DTAFSGS LKLKHGLLS ALDS SS RTS * S T S S 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFS DS I S F CFS SSS FCKR* FVS S KVSQNALLS SRLSNGPGGS SK 
QRNSLTARQLAMSL*ATKF*RNACNPNCLSSKKSAL*LSLNQRF 
GGSASRKPGNI SFNSQKCSALS YCCNFVI KPREVSVSSENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YIRLTPDMQSKC<3ALWNRVPCFLRDWELQVHFKIHGQGKKNL\H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQESITLEDVAV 
DFTWEEWQLLGAAQKDLYRDVMLENYSNLVAVGYQASKPDALFK 
LEQGEQLWTIEDGIHSGACSDIWKVDHVLERLQSESliVNRRKPC 
HEHDAFENIVHCSKSQFLLGQNHDIFDLRGKSLKSNLTLVNQSK 
GYE I KNSVEFTGNGDS FLHANHERLHTAI KFPASQKL I STKSQF 
I S PKHQKTRKLEKHHVCSECX3KAF I KKSWLTDHQVMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQK7HTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/GKGFIQKTCLIAHQRFHTER 


6692 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK


6693 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAP PGA YT P S Q P LS S VSTETAS S VRRQ AAE S RQHE L P VR 
EVHS LGQILPQDGLTAEAG P PEAQDPWGS PG ISLPAAH I GFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAGKFQ VDTGAKFYRGMS LEYYG I EADDNPFFDLSVYFLP 


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHS LGQ ILPQDGLTAEAGP PEAQDPWGS PGI SLPAAHI GFAAA 
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW 
NAAAGKFQ VDTG AKFYRGMS LEYYG IE ADDNPFFDLSVY FLP 


6696 


1 


782 


PRVRGRVGERWAFLS VP AAMS S EMEP LLLAWS YFRRRKFQLCAD 
LCTQMLE KS P YDQAAW I LKARALTEM VY I DE I DVDQ E G IAEMML 
DENAIAQVPRPGTSLKLPGTNQTGGPSQAVRPITQAGRPITGFL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
TS PDG PF INLSRLNLTKYS QKPKLAKAL I E YI FHHENDVKTALD 
LAALSTEHSQYKDWWWK/DQIEKCYYRVGMYREAEKQIKSS 


6697 


3 


782 


PPLFLRRLNSRALRPGSRKVMAWPASLSGQDVGSFAYLTIKDR 
IPQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTD K P F I PLVEKF VDTD I WNQ YLE YQQS LLNESDG KS RW F Y S P 
W L L V\ E C YMYRR I HEAI \ I QS P P ID Y FD VFKES KEQN F YGS QES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGEISVDL 
S L \ SGG ESS SQNTNVLNS LEDLKP F I LLN DMEHLW S LLSNC K 


6698 


668 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP / PARRVLPRAMTAS AQPRGRRPGVGVGVWTS CKHPRCV 
LLGKRKGS VGAGS FQLPGGHLEFGETWE ECAQRETWEEAALHLK 
NVHFASVVNSFIEKElXYHYVTILMKGEVDVTHDSEPKin/EP 
ES KR 1 1 YNHAF F FQES KWSGG I LQ 


6700 


1098 


1392 


TQCWRS STPGMRTHFRTQP / RLE CGQGFSQQENGHCMDTNECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAG PRTRVRRAAAFEGQPS PSPGLG PTSDKAAAPRTP KRRRLW 
RQRQ /H PAMLCYVTRPDAVLMEVE VE AKANGEDCLNQVCRRLGI 
I EVDYFGLQFTGSKGESLWLNLRNRI SQQMDGLAP YRLKLRVKF 
FVE PHL I LQEQTRHI FFLH I KEALLAGHLLCS PEQAVELS ALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNSIVAKHKELEGTSQ 
AS AE YQVLQ I VSAMENYG I E WHS VRDSEGQKLL I GVGPEG I S I C 
KDDFSPINRIAYPWQMATQSGKNVYLTVTKESGNSIVLLFKMI 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDLKGHLASLF 
LNENINLGKKYVFDI KRT S KE VYDHARRAL YNAG WDLVS RNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKLKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702 


397 


1971 


PLAKFLKLDLVNVLCLPMEDVFLFYRTCFCSMGLGSSCHLSLPK 
RAEALLCSRKATVVRDLVAVRMAEEQEFTQLCKLPAQPSHPHCV 
KNTYRS AQHSQALLRGLLALRDSG ILFD WLWEGRHIEAHR I L 
LAASCD YFKGM FAGG LKEME QEEVL I HGVS YNAMCQ I LH F I YTS 
ELELSLSNVQETLVAACQLQI PEI IHFCCDFLMSWVDEENILDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LS SNRLEVS CETEVYEGALL YHYS LEQVQADQ I S LHE P P KLLET 
VR F PLMEAE VLQRLHD KLD P S P LRDTVAS ALMYHRNES LQ P S LQ 
SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASIAPRMSNQGIAVLNNFVYLIGGDNNVQ^FRAESRCWRYDP 
RHNRW FQ I QSLQQEHADLS VCWGRY I YAVAGRD YHNDLNAVER 
YD P ATNS WAYVAPLKRE VYAHAGATLEG KMY I TCGRKGR I T 


6703 


45 


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCFIIAEIGQNHQGDLDVA 
KRM I RMAKECGADCAKFQKS ELEFKFNRKALERP YTSKHSWGKT 
YGEHKRHLEFSHDQYRELQRYAEEVGI FFTASGMDEMAVE FLHE 
LNV PF FKVGS GDTNNFP YLE KTAK/TRGWH S VLRD VCG VQLNDE 
TS S WDVLGRVRTS KE KVLMVLVLD YSGRPMVI S S GMQS MDTMKQ 
VYQ I VKPLNPNFCFLQCTSAYPLQPEDVNLRVI SEYQKLFPDI P 
IGYSGHETGIAISVAAVAIjGAKVLERHITLDKTWKGSDHSASLE 
PGELAELVRSVRLVERALGS PTKQLLPCEMACNEKLGKS WAKV 
KI PEGTILTMDMLTVKVGEPKGYPPEDI FNLVGKKVLVTVEEDD 
TIMEE 


6704 


82 


1007 


TKNTRNRWNSGLGASPASRPTRDPQDPSGRQGELSPVEDQREG 
LE AAPKGPSRE S WHAGQRRTS AYTL IAPN INRRNB I QR I AEQE 
LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN 
MGQKLGLQ\DSLKAEENRKLQKMKDEQHQKSELLELKRQQQEQE 
RAK I HQTEHKK VNlN A r JjL»K-LaAj \Jir\j<ylj£i UO vj\*tU nil pi« o u« o vt 

GI 


6705 


2 


786 


RLCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
SYKRKGGIMSTIAAFYGGKSILITVATGFLGKELMEKLFRTSPD 
LKVI YILVRPKAGQTLQHRVFQI LDS KLFEKVI EVRPNVHEKIR 
AI YADLNQNDFAISKEDMQELLS CTNI I FHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLI2DWPNIYTYTK 


6706 


130 


531 


FTHSSSSHSQEMIiGKIiNMLRNDGHFCDITIRVQDKIFRAHKVVL 
AACSDFFRTKLVGQAEDENKNVLDljHHVTVTGFIPLLEYAYTAT 
LS INTENI IDVLAAASYMQMFSVASTCSEFMKSSILWbTTPNSQP 
EK 


6707 


2233 


1343 


YWSGIGYELQHFHWRKJHFEKKGPPSTCQERLYESRSRWPCIS* 
G MVWG WTAVNGSW * GGQLRCV CVCTS HS SDS TRSSQRAS KCHS 
FFILSQ*KT*SSWENWVFAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR* S R FCGLCNP CGHCGLD I NLRGGS S PWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 
C * R CH WY FE WL LYNH CG D I LVACL * RRQL* S S Q 


6708 


115


1729 


TVGSWSRSGRSPPVGRQLLLTGRGAQAAGSPQGGMALQVELVPT 
GE 1 1 RWHPHR PCKLALGSDGVRVTMESALTARDRVG VQDF VLL 
ENFTSEAAFIENLRRRFRENLIYTYIGPVLVSVNPYRDLQIYSR 
QHMERYRGVSFYEEPPHLLAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS S RFGKYMDVQFDFKGAP VGGH I LS YUuEKSRWHQ 
NHGERNFHIFYQLLEGGEEETLRRLGLERNPQSYLYLVKGQCAK 
VSS INDKSDWKWRKALTVIDFTEDEVEDIiLS IAASVLHLGNIH 
FAANEESNAQVTTENQLKYLTRLLSVEGSTLREALTHRKI IAKG 
EELLSPLNLEQAAYARDAIAKAVYSRTFTWLVGKINRSIASKDV 
ES PS WRSTTVLGLLD I YG FE VFQHNSFEQFCINYCNEKLQQ LFI 
ELTLKSEQEEYEAEGIAWEPVQYFNNKI ICDLVEEKFKGI I\S I 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVS KRS RKEEEDLEAL I AHFQTLDAKRTQTVEIiPCPP 
PSPRLNASLSVHPEKDELILFGGEYFNGOKTFLYNELYVYNIRK 
DTWTKVD I PSP P PRRCAHQA\ATVPQGGGQLWVFGGEFAS PNGEQ 
FYHYKDLVT^HlATKTWEQVKSTGGPSbR^bHKnVAW 
GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGCQN 
I PS LPRAAS S VYGGYSKQRVKKDVDKGTRHSDMF 


6710 


158 


980 


RHKMTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATN 
I QAGAS FG YQLLVA/VWANIjMAML IQ 1 IjbAK^ 
RDHYPRPWWFYWVQAE I IAMATDLAEFIGAAIGFKLILGVSLL 
QGAVLTGIATFL I LMLQRRGQKP LEKVI GGLLLFVAAAY I VELI 
FS Q PNLAQLGKGMVT PS L PTS EAVFLAAG VL \ GAT I MPHV I / YI 
WHS SLTQHLHGGS RQQRYSATKWDVAI AMTIAGFVNLAI MATAA 
S ELN F YGHTGVA 


6711 


3 


347 


VTE CKTMTCKMS QLERN I * TMINTLHHYS VKLGHPDTL IHGEFK 
ELVRTDLHN I LMKENKNDQAI * HI MEDLDTNAHMQ 1 1 FKEL I ML 
MAMLTWSYHDNMHDADYGPGQQHRPG 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAXDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSVVRLPPGENIDDWIAVHV 
VDFFNRINLIYGTMAERCS*TSCPVMAGGPRYEYRWQDERQYRR 
PAKLSAPRYMALLMDWIESLI 


6713 


2485 


3 


QARGSDSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QRKT I PVILDGKDWAMARTGSGKTACFLLPMFERLKTHSAQTG 
ARAL I LS PTRE LALQTLKFTKE 1/3 KFTG L KTAL I LGGDRMEDQ F 
AALHENPDII IATPGRLVHVAVEMSLKLQSVEYWFDEADRLFE 
MGF AEQ LQE 1 I ARLPGGHQTVL FSATLP KLLVE FARAGLTE P VL 
IRLDVDTKLNEQLKTSFFLWEDTKAAVLLHLLHOTVRPQDQTV 
VFVATKHHAEYLTELLTTQRVSCAH I YSALDPTARKINLAKFTL 
GKCSTL IVTDLAARGLDI PLLDNVINYSFPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVIX3MLGRVPQSVVDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPES I KRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VDS I KNYRSRAT I FEINASSRDLCSQVMRAKRQKDRKAIARFQO 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
E WG R KRQRSG PNRG AKRRR E EARQRDQE FY I P YR PKDFDS ERG 
IiSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKI KTESGRYI SS S YKRDLYQKWKQKQKI D * S * L 
GRRRG ILTRRRPRTEE VGEARPLAQAGCI PGPHAPRHPLQAESA 
LELKTKQQ I LKQRRRAQKAALS LQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLP PP PAPMAHI PSGGAPAAGAAPMGPQYCVCKVELSVS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLKFALFDQDKSSMRLDEHDFLGQFSCSLG 
TIVSSKKITRPIjUjLNDKPAGKGLITIAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDP FLE FY KPGDDGKWMLVHRTE VI KYTLDPVW 
KPFTVPLVSLCDGDMEKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGI I ILRSCKINRDYS 
FLD Y I LGGCQLMFTVG I DFTASNGNPLDPS SLHY INPMGTNE YL 
SA I WAVGQ 1 1 QD YDSD KMF P ALG FG AQLP P DWKVS HEFAI NFNP 
TNPFCSGVDG IAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPATVQAIiAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL 
PSG P S S P F PTE EQP VAS W ALS F ERLLQDPLGLAY FTEFLKKE FS 
AENVTFWKACERFQQ I PASDT 


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ 
HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQYAG 
YD Y S QOGR FVP PDMMQ PQQ P YTGQ I YQ P TQAYT P AS PQ P FYGNN 
FEDEPPLLEE LG INFDH I WQ KTLTVLHPLKVADGS IMNETDLAG 
PMVFCLAFGATLLLAGKI QFGYVYG I S AIGCLGMFCLLNLMS MT 
GVSFGCVASVLGYCLLPMILLSSFAVI FS LQGMVG 1 1 LTAG 1 1 G 
WCSFSASKIFISALAMEGQQLLVAYPCALLYGVFALISVF 


6718


290 


599 


KQ S S TVPGT I L PS L KWHNSG LCKFPE TGG KMTT FKEGLT F KDVA 
V I FTEE ELGLLD P VQRNL YQD VMLEN FRNLLS VGHH P FKHD VFL 
LEKEKKLDIMKTATQ 


6719 


1 


691 


PTRPEEQDREDGKCHKMBMNPISGNLNCDPIAMSQCSSDHGCET 
DLDS DDDKI EKPNNFMKDS AS QDNGL SRKI SR KR VCS S DS DS S L 
Q WKKS S KARTGLLRI TRRCAATAAN K I KLMS D VEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGSTKVLSQALNGDSDSEDMLNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG i 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VP I T E KS NPLTQ DLD KAD AEN I VRLLGQC DAE I FQEE GQ ALS T Y 
QRLYS ES ILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVSFNQLMKGLGQKPLYTYIilAGGDRSWASREGTEDSALHG 
I EELKKVAAGKKRVI VIG I S VGLS AP FVAGQMDCCMNNTAVFLP 
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEP^KWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY 
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMS VS FNQLMKGLGQ KPL YTYIj I AGGD RS WASREGTE DSALHG 
I EELKKVAAGKKRVI VIG IS VGLS APFVAGQMDCCMNNTAVFLP 
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYOITSLLFSM 
SWTLISE 


6722 


1 


390 


RS WSKRTWQAL PMAVLFLLLFLCGTPQAADNMQAI YVALGEAVE 
LP CPS PSTLHGDEHLS WFCS PAAGS FTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNYS LWLEGS KE EDAGRY W CAVLGQHHNYQNW 


6723 


173 


659 


VCQYCTARMAD FG I S AGQFVAVVWDKS S P VEALKGLVDKLQALT 
GNEGR VS VENI KQLLQSAHKESS FDI I LSGLVPGS TTLHSAE IL 
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL 
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6724 


173 


659 


VCQ YCTARMADFG I S AGQFVAWWDKSS PVEALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AHKES S FD I ILSGLVPGSTTLHS AEI L 
AE I ARI LRPGGCLFLKE PVETAVDKNS KVKTASKLCSALTLSGL 
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6725


356 


722 


RRRTPPV I LATMDDDLMLALRLQE E WNLQEAE RDHAQES LSLVD 
AS WELVD PT PDLQALFVQFNDQ FFWGQLEAVEVKWS VRMTLCAG 
I CS YEG KGGMCS IRLS EPLLKLRPRKDLVEVFFV 


6726 


98 


714 


HLQKMERKINRREKSKEYEGKHNSLEDTDQGKNCKSTLMTLNVG 
GYLYITQKQTLTKYPDTFLEGIVNGKILCPFDADGHYFIDRDGL 
LFRHVLNFLRNGELLL PEGFRENQLLAQEAE FFQLKGLAEEVKS 
RW E KE QLT P RE TTFL E I TDNHDRS QG LR I FCNAP DF I S K I KS R I 
VLVSKSRLDGFPEEFSISSNIIQFKYFIK 


6727 


1 


831 


FRGMGDER PH YYGKHGTPQKYDPTFKGP I YNRGCTD 1 1 CCVFLL 
LAIVGYVAVGIIAWTHGDPRKVIYPTDSRGEFCGQKGTKNENKP 
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS 
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLI PSKPLARRCF 
PAI HA YKG VLMVGNETT YE DGHGS RKN I TDLVEGAKKANGVLEA 
RQLAMRI FEDYTVSWYWD I I SLG I AMAMSLLF I I LLRFLAGIMG 
RGMIIMGILVLGY 


6728 


486 


935 


FCSSWLRSLADSSLSWKMFLVGLTGGIASGKSSVIQVFQQLGCA 
VI DVDVMARHWQPGYPAHRRIVEVFGTEVLLENGDINRKVLGD 
L I FNQ P DRR QLLNAI TH PE I R KEMM KE TFKYFLRE PRTS PRGKK 
HVP S ALKEADS LMRRDT 


6729 


259 


1191 


VG LTGAQSGRTAS MGRDQRAVAG P ALRRWLLLGTVTVGFLAQS V 
LAG VKKFDVP CGG RDCSGG CQ CYP E KGGRGQPG PVGPQG YNGP P 
GLQGFPGLQGRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGI 
PGHPGQGGPRGRPGYDGCNGTQGDSGPQGPPGSEGFTGPPGPQG 
PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GPVGAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPN 
GIPSDTLHPIIAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 
GIM 


6730 


784 


1015 


NMVDYYEVLGLQRYAS PED I KKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYE VLSNDEKRD I YDKYGTEGLNE F 


6731 


1 


446 


GIRKRLHGAWPRVEVGCPWETRESEGVHLERPTSPLKNNDEGS 
LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAKEATY 
NDLQ VE YG KCQ LQMKE LMKKFKE I QTQNFS LINENQ S LKKN I S A 
LIKTARVEINRKDEEI 


6732 


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGS I KPS SRSTKATS TTMAGDGRRAEAVREGWGVYVTPRAP IRE 
GRGRLAPQNGGSSDAPAYRTPPSRQGRREVRFSDEPPEVYGDFE 
PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 
YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6733 


613 


1311 


RS CRC VGMRS RNQGGES ASDGHI S C PKPS I IGNAGE KSLSE DAK 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLI 
QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDA I LAQEKS IGEDVYEKP I S ELDRLEEKQKETYRRMLEQL 
LIAEKCHRRTVYELENEKHKHTDYMNKSDDFTNLLEQERERLKK 
LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHFTWEEWQDLDD 
AQRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVEE 
T LN L RLSGG S KKQ VFS G I CHRS L VE LQE VHL V 


6735 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QASLNAGLDLRLAVQLPPGEDLNDWVAVHWDFFNRVNLIYGTI 
XDGCT 


6736


195 


C (\ Q 


MMVCT.KTPVPPMDMT VCT flT.TWT.WTTT .T.VPT.QQVT.DT.TTnV\7VT?t?N 
PUN I CjLuN r iviN £j n tr I N J. ivo J_iLj Li 1 IN iJf< T ijixIVrvJUoO V JJir iJ 1 1 U I V X r HIM 

SSSNPYLIRRI EELNKTASGNVEAKVVC FYRRRD I SNTL I MLAD 
KHAKEIEEESET11/EADLTDKQKHQLKHRELFLSRQYESLPATH 
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 


6737 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRLESYRPDTDLS 
REDTGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAJYYHIKNFJ3PDGRMLLDIFDENLHPLSKSEVPPDYDKHN 

DEVUfflT VT3 fVDTr TTCH ft f^T T'TitTr'a T^TTT ArVT.TTTJT .T TV IV IT T PI T CV 
rCiVJvUX IKr VKl ur 0/\AUL>lrUlW\± V A.UV HjEiKIjIjX I AH J.D ±V-tr 

ANWKR I VLGAILLASKWDDQAVWNVDYCQ I LKD ITVEDMNELE 
RQ FL ELiLQ FN I NVPS S VYAKYY FDLRSLAEANNLS F P LE PLSR E 
RAHKLEAISRLCEDKYKDLRRSARKRSASADNLTLPRWSPAIIS 


6738 


148 


653 


CACAEQPARAEVGAATALPVRWASGEMAPSGSLAVPLAVLVLLL 

7i Ti TaT TTJ D D Q Ml rD T 7 T r PT~> T7 1 XTTVTD T7 T T tTTmLJfci TT?T?VADUr , D!\ rTlMT. 

WtaAi'WlhloKKajM VKv J. I IJnNWKISijljHWLwn.L fir lAr'WL.Kfv-yiNlj 

QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLOKKLAETEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARS DS WS LANLSS T KELDLS D ANPEVTMTMLRW I YTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 

i/USHLiM/lo 1 JjrUN I UAH 1 lAonH via Ei VHIjVINIUvLj 


6740 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKI^KlOiAETEKRC^ 
ANKESSS ESFIS RLLAI VADL YEQEQ YSDLKI KVGDRH I SAHKF 
VLAARS DSWS LANLSSTKELDLSDANPEVTMTMLRWI YTDELE F 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLWvTlNCIRFYQ 
TAEELNASTLMNYCAEI I AS HWVS E VEG VN KAL 


6741 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 

YDLNSNNPNP 1 1 S YDGVNKNIASVGFHEDGRWMYTGGEDCTARI 

WDLRS RNLQCQR I FQVNAP I NCV CLH PNQAE L I VGDQSG AI H I W 

DLKTDHNEQL I PE PE VS I TS AHI DPDAS YMAAVNS TLVPFS CLL 

PT.aTnTT.DPRFF'F^TiaPPfJT J.PT JvrOGMC!WWNt.TGC;TGDEVTO 
riiniu x u\4eaj ej r Cij nr\i\.t\\y Ltur J_kriv~ ^vjrw ^ j. v niiu J. j-uijij » i. v 

LIPKTKIP 


6742 


141 


960 


PLTLP F S SRARAGHTMNTS PGT VGS D P VI LATAG YDHTVRFWQA 
HSG ICTRTVQHQDSQVNALE VTPDRSM IAAAVQP VSLGYQH IRM 
YDLNSNNPNP 1 1 SYDGVNKNIASVGFHEDGRWMYTGGEDCTARI 
WDLRS RNLQCQRI FQVNAP INCVCLH PNQAEL I VGDQSGAIH I W 
DLKTDHNEQLIPEPEVS ITSAH IDPDAS YMAAVNSTLVPFSCLL 
PLAIG I LQEGEFESLAPJ^GIJjFLACG^NCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTGAPRKMPKRI SIS KQLASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN 
FAEGQETKPKYRE I LS ELDEHTENKLDFEDFM I LLLS I TVMSDL 
LQNIR 


6744 


95 


1343 


RTPARNRCAGCEVLSRFSSPNKASSFALQSAGGGLPAVRALRRD 
RQKVSTVG YGMDE VEQDQHEARLKELFDS FDTTGTG S LGQBELT 
DLCHMLSLEE VAP VLQQTLLQDNLLGRVHFDQFKEAL I LILSRT 
LSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTV I E P LDEEAR P S H I PAGD CS EHWKTQRS EEYEAEGQLRFWNP 
DDLNAS QSGS S PPQDW I EEKLQEVCEDLG I TRDGHLNRKKLVS I 
CEQYGLQNVDGEMLEEVFHNLDPDGTMSVEDFFYGLFKNGKSLT 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERILDTWQEEGIENSQBILKALDFGLDGNINLTEL 
TIiALENELLVTKNS IHQACI 


6745 


1 


588


T FRDQGWAQRRRW LLGCASWES WEAAI AAGPGLPS STARQQNNP 
AAG TEC F AAVWARGTAMGS VLS TDSG KS AP AS ATARALERRRDP 
ELPVTSFDCAVCLEVLHQPVRTRCGHVFCRSCIATSLKNNKWTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI 
S LWTWAAI QAVEKKMESQAARLQSLEGRTGTAE KKLADCE KMA 
VE FGNQLEGKWAVLGTLLQE YGLLQRRLENVENLLRNRN 


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDLAQRKLYRDVMLENFRNLLSVGH 
QP FHRDTFHFLREE KFWMMD I ATQREGNS VYAGVC 


6748


201 


665 


MTT F KEAVT F KDVAWFTEE E LG LLD PAQRKL YRD VMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
QI EARLS I SXVQQXPYRCNECKQ 


6749 


95 


719 


RRE V KGG DG VC P RARGS P QSQQ F P S CAGGGE G LQQ S GEALDGAM 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF 
VDVDLLLGEIDPDQADITYEGRQKMTSLSSCFAQLCHKAQSVSQ 
INHKLEAQLVDLKSELTETQAEKWLEKEVHDQLLQIiHS IQLQL 
HAKTGQSADSGTIKAKLSGPSVEELERELKAN 


6750 


3 


428 


SCESRRPGAKWVWASGALPRDTTGLGSEQPSGDVAQSNRATMGT 
TAPGPIHLLELCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFS KEPELMPKTPSQKNRRKKRR I S YVQDENRDP IRRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGAS VKVAVRVRPFNSREMSRDSKC 1 1 QMSGSTTTI V 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEG YNVCI FA YGQTGAGKS YTMMGKQEKDQQGI I PQLCEDL 
FSRINDTTNDNM S YS VB VS YME I Y CER VRDLLNP KNKGNLRVRE 
HPLLG P YVEDL S KLAVT S YND I QDLMDSGNKARTVAATNMNETS 
S RSHAVFN 1 1 FTQKRHDAE TNT TTE KVS K I S LVD LAGS ERADS T 
GAKGTRLKE G AN I NKS LTTLG KV I S ALAEMDS G PNKNKKKKKTD 
F I P YRD S VLTWLLRENLGGNS RTAM VAALS P AD I NYDETLS TLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
ITDMTNALVGMSPSSSLSALSSRNV 


6752 


24 


1834 


RNCVP PLGC YRS RVKFHSD I KMQ YSHHCEHLLERLNKQREAGFL 
CDCT I V I G E FQ FKAHRNVLAS FS E Y FG AI YR S TS ENNVFLDQ SQ 
VKADGFQKLLEFIYTGTLNLDSWNVKEIHQAADYLKVEEWTKC 
KI KMEDFAFIANPS STE I S S I TGNI ELNQQTCLLTLRDYNNREK 
SE VS TDL I QANP KQGALAKKS S QTKKKKKAFNS P KTGQN KTVQ Y 
PSDILENASVELFLDANKLPTPWEQVAQINDNSELELTSWEN 
TFPAQD I VHTVTVKRKRGKSQPNCALKEHSMSN I AS VKS P YEAE 
NSGEE L DQR YS KAKPMCNT CG KVFS EAS S LRRHMR IHKG VKP YV 
CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF 
HSRMHHGEEKP YKCD VCNLQ FATSSNLK I HARKHSGEKP YVCDR 
CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR 
KHTGEKPFI CELCGNS YTD I KNLKKHKTKVHSGADKTLDS SAED 
HTLSEQDSIQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV 
TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPSLPYPPOKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFG IKLRRTNYSLRFNCDQQAEQKKKKRHS STGDS ADAGP PAAG 
S ARGE KEME G VALKHG P S L PQE RKQAP S TRRDS AE PS S S RS VP V 
AH PGP P P AS SQTPAPEHDKAANKMPLAQKPALAP KPTSQTP PAS 
PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQESSDRRPPSPP 
GPEERKGQKRDEEEEATERKPASPPLPATQQEKPSQTPEAGRKE 
KP MLQSRH S LDGS KLTE KVETAQP LWI TLALQKQKGFREQQATR 
EE R KQ AR E AKQ AEKLS KENVS VS VQ PG S S S VS RAGS LHKS TAL P 
EEKRPETAVSRLERREQLKKANTLPTSVTVE I S YSS PAAPLVKE 
VS KRFSSPDDAPVSSEPAWLALAKRKAKAMSDCPLI IK 


6754 


2 


413 


FVRRRRRRLGGPE VNTMSSLHKSR I ADFQDVLKEPS IALEKLRE 
LSFSGlPCEGGLRCLCWKILLNyLPLERASWTSILAKQRELYAQ 
FLREMlIQPGIAKANMGVSREDVTFEDHPIiNPNPDSRWNTYFKD 
NEVLL 


6755 




1343 


PGLQLQVALEADWFLDMPGGRRGPSRQQLSRSALPSLQTLVGGG 
CGNGTGLRNRNGSAIGLPVPP I TALITPGPVRHCQI PDLP VDGS 
LLFEFLFF I YLLVALFI QY IN I YKTVWWYPYNHPASCTSLNFHL 
ID YHLAAF I TVMLARRLVWAL I SEATKAGAASM I HYMVLI SARL 
VLLTLCGWVLC WTLVNLFRSHS VIiNLLFLGYP FG VYVP LC C FHQ 
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNNATP I PTHSCPLS PDLIRNE VECLKADFNHRIKEVLFNS 
LFSAY YVAFLP LCFVKVSGYLTFMCFLDLCVNY INWVFLV 


6756 


180 


754 


IERALGSLPLS I PVSWGSLRTLKYQQQPLRPKVLLCQTRVQCHD 
LRSLQPQPPGLKQSFCLRVLGLQTGATTPGLRDLTCKELIILTE 
RE AQ KRKKR KE KES GMALTQG PLT FRDVAI E FS QEE WKSLD PVQ 
KALYWDVMLENYRNLVFLGKDNFALEVKICPRVFbYFLCCLSWE 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLFSHLSAVQ 
TRG I KHR I KWNR KAL PS TAQ I TEAQ VAENRPGAF I KQGRKLD I D 
FGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQ 
AANQGE FQKPDNKLHQQ VLW 


6758 


1 


1008 


ASGPELPGRRFRDRAPWLPARLiLRGVliAWJVSLSALGPGS FCRR 
RVPSLAQIjGHS EAAPSPDDVRWSRVPDRCPEERDRAWPPP PP P S 
LP PS FRRNMANNS P ALTGNS QPQHQAAAAAAQQQQQ CGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK 
TYHEWDE I YFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGG I 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
W FSTLF PR I PVPVQKN IDQQ I KTRPRKI 


6759 


1 


513 


RKHNFHS LDGTSTRAFHPQTGLPLLSS PVPQRKTQS GCFDLDS S 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEESVLNYRFDPLGI VDG FTAE VGASGAFCPTHLTLPVEVS FY 
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIQWCVL 


6760 


233 


606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT 
AMSVKEVLQSLVDIXSMVIXZERIGTSNYYWAFPSKALHARKHKLE 
VLESQLS EGSQKHASLQKS I EKAKIGRCETEERT 


6761


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
S SSS VQRCELSLFQS LHTMTS KKLVNS VAGCADDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGSI LAAI RAVAQAGTVGTLL I VKNYTGD 
RLN FGLAREQARAEG I PVEMWI GDDSAFTVLKKAGRRGLCGTV 
LIHKVAGALAEAGVGLEEIAKQVNVVTKAMGTLGVSLSSCSVPG 
S KPTFELS ADEVELGLG I HGEAGVRRI KMATADE I VKLMLDHMT 
NTTNASHVP VQ PGS S WMMVNNLGGLS FLELG 1 1 AD AT VR S LEG 
RGVKIARALVGTFMSALEMPGI SLTLLLVDEPLLKLIDAETTAA 
AW PNVAAVS I TGRKRS RVAPAE PQEAPDSTAAGGS AS KRMALVL 
BRVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PP PAS PAQLLS KLSVLLLEKMGGSSGAL YGLFLTAAAQ PLKAKT 
SLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQEL 


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAFITLAVAAGLYYLAELI EE YTVATSR 1 1 KYMIWFS TAVLIG 
LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LVWNHYLAFQFFAEEY YPFSEVLAYFTFCLW 1 1 PFAFFVSLSA 
GENVLPSTMQPGDDWSNYFTKGKRGK 


6763


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGHWDMNSAPRLVSETAE 
RKQEQKTGTEAEAADSGAVGARRFLLCLYLGGFLDLFGVSMWP 
LLSLHVKSLGAS PTVAGIVGSS YG ILQLFSSTLVGCWSDWGRR 
SSLLACI LLSALGYLLLGAATNVFLFVLARVPAGIFKHTLS ISR 
ALLSDWPEKERPLVIGHFNTASGVGFILGPVVGGYLTELEDGF 
YLTAF I CFLVFI LNAGLVWFFPRREAKPGSTE 


6764 


80 


438 


LKKMDTMMLS VRNLFEQLVRRVE I LS EGNEVQF I QLAKDFEDFR 
KKWQRTDHELGKYKDLIjMKAETERSAXjDVKLKHARNQVDVEIKR 
RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550 


arysrvdhfcrrrcravaraprfllqfpsgpsrhflaacvarwl 
rgs vl vseals gs amdg i vte va vg vkrgs dell sgs vls s pns 
nmssmwtangndskkfkgedkmdgapsrvlhirklpgevtete 
vialglpfgkvtnilmlkgknqaflelateeaaitngnyysavt 

PKLRNQ 




1 


1287 


eggsfkasltwlwplgemklhcevevisrhlpalglrnrgkgvr 
avlslcqqtsrsqppvrafllistlkdkrgtryelrenieqfft 
kfvdegkatvrlkeppvdiclskanssslkgflsamrlahrgcn 
vdtpvstltpvktsefenfktkmvitskkdyplsknfpyslehl 
qtsycglvrvdmrmlclkslrkldlshnhikklpatigdlihlq 
elnlndnhles fsvalchstlqkslwsldlsknki kalpvqfcq 
lqelknlklddneliqfpckigqlinlrflsaarnklpflpsef 
rnlsleyldlfgntfeqpkvlpviklqapltllessartilhnr 
ipygshiipralcqdldtakicvcxsrfcuisfiqgtttmnlhsv 

AHTWLVDNLGGTEAP 1 1 S YFCSLGCYVNSSDI 


6767 


336 


919 


apmiclcssdlqfrykeaflrdrglqigycsvdddprmkhflnv 
grlqsdneykkdfaksrsqfhsstdqpgllqakrsqqlasdvhy 

RQ PLPQ PTCDP EQLGLRHAQKAHQLQS DVKYKS DLNLTRGVGWT 
P PG S Y KVEMARRAAE LAN AR G LG LQG A YR G AE A VE AG DHQS G E V 
NPDATE I LHVKKKKALLL 




2 


363 


PGSTISCYLLSEGSLPLCMQVACGEEjKHRAPTMKTLRARFKKTE 
LRLSPTDLGSCPPCGPCP I PKPAARGRRQSQDWGKSDERLLQAV 
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6769 


284 


396 


MSTPDFSTAEl^QEIANEVSCLiKAMLTLMLQAMGQAD 


6770 


1 


397 


QPJfVOVIWSSTMAKLriDYYKDEVVKKLMTEFNYNSVMQVPRVEK 
I TLNMGVG EA I ADKKLLDNAAADLAA ISGQKPLI TKAR K£ VAG F 
K1RQGYP I GCKVTLRGERMWE FFERLI TI AVPR I RDFRGLSAKS 


6771


i 3 


378 


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPR 
WAS WNI GVFI C I RC^GIHRNLGVHISRVKS VNLDQwrQEQIQCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1400 


AAAFLQGMTVNGF INTV I TS L \ERRYDLHS YQSGL IAS S YD I AA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P**GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPLYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY 
L I GGALLN I YTEMGRRTELTTESPLWVGAWWVG F LG S GAAAF FT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 
IRDLPLSIWLLLKNPTFILLCLAGATEATLITGMSTFSPKFLES 
QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK 
FCLFCTWS LLG I LVFSLHCPS VPMAGVTAS YGGSLLPEGHLNL 
TAPCNAACSCQPEHYSPVCGSDGLMYFSLCHAGCPAATETNVDG 
Q KVYRDCS C I PQNLS SGFGHATAGKCTST 


6773 


1 


630 


PWEAPKE H KYKAE EHT WLT VTGE P CHF P FQ YHRQ L YH KCTHKG 
RPGP Q P W CATT PN FDQDQRWG YCLE P KKVKDHCS KH S P CQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRL 
CHCP VG YTG PFCDVGE * GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYFLFFILSS/WVPTFLSMDVDGRVIKADSFSKIISS 
GLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TC PS QLRVLT ARGGRRAPS P QLWTLVLAL I E E KWRS HR I LRMNS 
GRPETMENLP ALYT I FQGEVAMVTDYGAF I KI PGCRKQGLVHRT 
HMS S CR VD KPS E I VD VGDKVWVXL I GREMKNDR I KVS LS MKWN 
QGTGKDLDPNNV\SLSKKRGOGDPSRITIiGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
LNGTFPNTSDADMEPCVDGWVYDRI S FS ST I VTEWDLVCDSQSL 
TS VAKFVFMAGMMVGG I LGGHLS DRFGRRFVLRWCYLQVAI VGT 
CAAIiAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIFLTS 
SWLLESARWLIINNKPEEGLKEIiRKAAHRSGMKNARDTLTLEIL 
KSTMKKEIiEAAQKKKP FLGERLHM PN I C KR I S LLPFTKFANFMA 
YFGLNLHG / LKHLGNNVFLLOTLFGAV/ TP PGQLVLHLGHWGSG 
RVSSRGRVNCLGLFVLQVW 


6777 


779 


63 



CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRS LGLYAWDDGSPLALLGGHQGG ITHLCFHP DGNRFFSGARKD 
AELLCWDLRQSGYPLWS LGREVTTNQR I YFDLDPTGQFLVS GST 
SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCTNGVSLHPSLPLLG 
H CLP VS VCFLS PTESGGRRRGAG P S LGS P RRHVHLE CRLQLWWC 
GGGARLQHP * * S PRARKGR 


6778 


311 


805 


IQS ITDESRGS IRRKNPANTRLRLNVP\EETAGDSE/ERSPEEE 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


RALRRQPRLLAANGIEPESMAISEPIKGSRKPCVNKEELALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV 
AWVSAKNPAPMRKKKKVSLGPVS YVLVDS BDGRKKPVMP KKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 


403 


HEVNDNKPEININLMSPGKEEISYIFEGDPIDTFVALVRVQDKD 
SGIiNGE I VC KLHGHGH FKLQKT YENNYLI LTNATLDRE KRS E YS 
LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSlPPVMSRPVSSSSISTPIiPPNQITVFVTSNPITTSANT 
SAALPTHI^SALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVF INSSS 1 1 Q VMKGS QPSTI PAAPLTTNSGLMPPSVAWGPL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTP TP P AP TLL KMTSS P VG PGTAS AG PS LPGG ALP TS VR S I VTT 
LVPSELISAVPTTKSNHGGIASESLAG 


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERP P P PKLS ATRRSNKKL PFNRS S S DMDLQKKQ SNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
N PGE LS CKRGDVLVMLKQTENNYLBCQKGEDTGRVHLSQMKL I T 
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV 
SEQ
ID
NO:


Predicted 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DDLNLTSGEIVYLLEKIDTDWYRGNCRNQIG I FPAN YVKVI ID I 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELS FSEGEI 1 1 
LKEYVNEEWARGE VRGRTGI FPLNFVE PVEDYPTSGANVLSTKV 
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR
I


6783 


3 


1750 


S YHHHHAQQS AAAS PNLTAS Q KTVTTT SM I TTKTL P LVL KAATA 
TMPAS WGQRPT I AMVTAINSQKAVLSTDVQNTPVNLQTS S KVT 
GPGAEAVQ I VAKNTVTLQ VQATP PQPIKVPQFIPP PRLT PRPNF 
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL 
PTSQNS I HPVRVVNGQTATIAKTFPMAQLTS IVI ATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEE I QSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKT I PKGMW I CPRCQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS I S KCMEMKNT ILARQKEMHSSLEKVKQL I RL I H 
G I DLS KPVDS EATVGAI SNGPDCTPPANAATSTPAPS PSS QS CT 
ANCNQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTLPLVLKAATA 
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL 
PTSQNSIHPVRVVNGQTATIAKTFPMAQLTSIVIATPGTRLAGP
QTVQLS KPSLEKQTVKSHTETDEKQTESRTI TPPAAP KPKREEN 
PQKLAFMVS LGLVTHDHLEE I QS KRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVIGFGALTPTSPQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSQQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT
ANCNQGEETK


6785 


1 


528 


LGNTVLHYC S M Y S KP E CLKLLLRS KP T VD I VNQAG ETALD I AKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397


RSPKVLVIAPTRELiANHVSRDFKpi \TRKLTVARFYGGTSYQSQ
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML 
DLGFAEQVEDIIHESYKTDSEDNPQTLLFSATCPQWVYTVA\KK 
YMKS RYEQ VDLDG KMTQKAATTVEHLA1 QCHWS QRP AVI GDVLQ 
VYSGSEGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR 
E ITLKGFREGSFKVLVATNVAARGLD I PEVDLVXQSS PPQDVES 
Y I HRS GRTGRAGRTG I C I CF YQ P RERGQLR YVEQKAG I T FKR VG 
VPSTMDLVKS KSMDAI RSLASVS YAAVDFFRPSAQRL I EEKGAV 
DAIAAALAHISGASSFEPRSLITSDKGFVTMTLESLEEIQDVSC 
AWKELNRKLSSNAVSQ I TRMCLLKGNMGVCFDVPTTESERLQAE 
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD * VFYHLVDFLSDFLVDSVYLTGRQ I DHLTGLTGL I DHLTSHS 
SVWN 


6787 


2646 


2270 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF 
FELGSPSGVISAKCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI
LVFLVEMGFHHVGQAGLKLLTLWIHPPWPPKVLGLQT


6788 


16 


936 


GGWDLR\DMLAVSVLAAVRGGR/ATVRRVRESNVLHEKSKGKT
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ
EVILWGS * DS *GYPKGK* LLPKEVPSR/RVLLSGLTPLDATQE\ 
FTEDLSK\ YVTTMVCVAVNGKPMLGV I HK£ FSE YTAWAMVDGGS 
NVKARS S YNE KTP R I WS RS HSGMVKQ VALQTFGNQTTI I PAGG 
AGYTCV1ALLDVPDKSQEKADLYIHVTYIKKWDICAGNAILKALG 
GHMTTLS GEE I S YTGS DG 1 EGG LIAS I RMNHQALVRKLPDLE KT 
GHK 


6789 


2 


678


GNGINVLKIAPESAIKFMAYEQIKRLVW**PGDS*GF/YERLVA
GSLAGAIAQSSIYPMEVLKTRMALRKTGQYSGMLDCARRILARE
GVAAFYKGYVPNMLGIIPYAGIDLAVYETLKKAWLQHYAVNSAD
PGVFVLLACGTMSSTCGQLASYPLALVRTRMQAQASIEGAPEVT
MSSLFKHILRTEGAFGLYRGLAPNFMKVIPAVSISYWYENLKI
TLGVQSR


6790 


2 


4068 


AP PAGRRRMQAAPRAGCGAALLLWI VS S CLCRAWTAPS TSQKCD 
EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQ I SAIATQGRYSS SDWVTQYRMLYS DTGRNW 
KP YHQDGN I W AFPGN I NS DGWRHELQH P 1 1 AR YVR I VPLDWNG 
EGR IG LR I E VYGCS YW ADVI N FDGHWL P YR FRNKKMKTLKDV I 
ALNFKTSE S EGVI LHGEGQQGDYI TLELKKAKLVLS LNLGSNQL 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMOHFR 
TNGEFDYLDLDYE I TFGGI PFSGKPSSSSRKNFKGCMES INYNG 
VN I TDLARRKKLEP SNVGNLS FS CVE P YTV P VFFNATS YLEV PG 
RLNQDLFSVSFQFRTWNPNGLLVFSHFADNLGNVEIDLTESKVG 
VHINITQTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG 
DE AS AVRTN S PLQ VKTG EKYF FGG FLNQMNN S SHS VLQPS FQGC 
MQLI QVDDQLVNLYEVAQRKPGS FANVS I DMCAI I DRCVPNHCE 
HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYiCHLGQT 
S NY YW IDPDGSGPLG PLKVYCNMTEDKVWTI VSHDLQMQTPWG 
YNPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTPDGSPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAKL 
S VG PLRCQGDRNYWNAAS FPN PS S YLHF ST FQGETS AD I S FY F K 
T LT P WG VFLENMGKEDF I KLE LKS ATE VS FS FDVGNG P VE I WR 
SPTPLNDDQWHRVTAERNVKQASLQVDRLPQQIRKAPTEGHTRL 
ELYSQLFVGGAGGQQGFLGC I RS LRMNGVTLDLEERAKVTSGFI 
S GC S GHCTS YGTNCENGGKCLER YHG YS CD CSNTAYDGT FCN KD 
VG A FFE EGMWLR YN FQAP ATNARDS S SR VDNAPDQQNSH P DLAQ 
EEIRFSFSTTKAPCILLYISSFTTDFLAVLVKPTGSLQIRYNLG 
GTREPYNIDVDHRNMANGQPHSVNITRHEKTIFLKLDHYPSVSY 
HLP S SSDTL FNS P KSLFLGKVI ETGKIDQB I HKYNT PG FTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCGASPLTLSPM 
S S ATDP WHLDHLDS ASADFP YNPGQGQAI RNGVNRNSAI I GGVI 
A\ WI FTPS LCTP \ VLP * SR * HVS PHKGTLP I PNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAMG 


6791 


1801 


1193


TGHEGAKGEKGDKGDLGPRGERGQHGPKGEKGYPGIPPEL/PGW
SAW* SWLTAASTKVQAILLPQPLE * LGLQIAFMAbiiA lilt? b«u 
NSG 1 1 FSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EE VY VYLMHNGNTVFS MYS YEN KG KS DTS S NHAVLKLAKGDE VW 
LRMGNGALHGDHQRFSTFAGFLLFETK 


6792 


33 


1073 


VRHTNWG VDM YLFS LGSESP KGAI GHIVSTEKTI LAVERNKVLL 
PPLWNRTFSWGFDDFSCCLGSYGSDKVLMTFENLAAWGRCLCAV 
CPSPTTIVTSGTSTVVCVWELSMTKGRPRGLRLRQALYGHTQAV 
T CLAAS VT F S LL VSG S QDCTC I LWDLDHLTHVTRLP AHREG I SA 
I T I SDVSGT I VS CAGAHLSLWNVNGQPLAS ITTAWGPEGAITCC 
CLMEG P AWDTS Q 1 1 1 TG S QDGMVR VW KT /VGCE D VCS WTAS RRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKLLVGDERGRIFCWSADG*EERGSRGSGTTVPG 


6793 


2340 


805 


GRKEANY \ YGSLTQAGTVS LGLDAEGQEVFVP FS AVLPMVAPND 
LVFDGWDI SSLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV 
Y I PE FI AANQS ARADNL I PGS RAQQLEQI RRD I RDFR S S AGLDK 
VIVLWTANTERFCEVIPGLNDTAEKTLLRTIELGLEVSPSTLFAV 
ASILEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKSVLVDFLIGSGLKTMSIVSYNHLGNNDGSNLSAPLOFRSKEV 
S KSNWDDMVQSNPVLYTPGEEPDHC WI KYVP YVGDS KRALDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPIMLDLALLTELCQRVSF 
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPWNALFRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK 
KGP VP AATNGCTGDANGHLQE EP PM P TT * G PGHTVSRLFLP AAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1349 


DDVKRKPEASAH* EKPGPPSRPG VRGGRERAGGRGSHGARS CR \ 
E PAP P AP AP PEDHPDEEMGFTIDIKS FLKPGE KT YTQRCRL F VG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
IAKAELDGTI LKSRPLR IRFATHGAALTVKNLS PWSNELIjEQA 
FSQ FGP VEKAWWDDRGRATGKG FVEFAAKPPARKALERCGDG 
AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREAKEKLEAE 
MFAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795 


1740 


1010 


GPRRQTQVRDHELDSF* DWAAQETDCAQNSGERL* KGV/ LENFS 
TMS KSAVKIS LDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ 
E KVNQ I Q KT V I E PL KKFGS VFPS LNMAVKRR EQ ALQD YRRLQ AK 
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRLDYFQPSFESIilRAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQR ERENEAKLS ELRALS IVADD 


6796 


48 


683 


GKE IQI PTIKLAWLLFGLE* PVGALGKGWSF* * SHVALGQLGW 
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET 
SVQEGRDCWQR*LPRLFSALVGQPGCWPQGAPPERCV*PGRCKW 
HLQSQVLR*ERRRCCRCLPRFA*GWRRRHQRLGLGIHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRPSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 
ERPPPPPSDDLTLLESSSSYKNL/DAQIPQ/DWSMSPSTSG*RP 
LTSRASS IMRSRTAI PSAS * SRLTTKHTVGGSPSAWRPRPTSRS 
VSTP VS S STETTASGS CLTWWS S SPAPCPS SSAPAHS FEAS CCK 
TSLWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT* PSSTTTI SSS 
PHCG WPCP AS CAS AAAWLS STWATAS VAGSCWGPIM* S SAHS PW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCX3SSPSSTFTPSS 
ASSSTWCSASSSRSS PAPTTPSS I PAAQAQRRASCRPTSHSART 
APPPAS S AAGAARPAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQS PQE WEALQALT YLGDRVS E KVKTKV I ELLYSWTM 
ALPEEAKI KDAYHMLKRQG I VQSDPP I PVDRTLI P S P P PRPKNP 
VFDDEEKS KLLAKLLKS KNPDDLQEANKL I KSMVREDEAR I QKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNCCTLIDLAELDTTWSIjSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL
DQLLEEAKVTSGLVKPTTS PLI PTTTPARPLL PFSTG PGS P LFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTELS PFS P IQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQ PGS PRLSALECVLLVPQ\ PQ IA 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKI KDAYHMLKRQG I VQS DP P I P VDRTLI PS P P PRP KNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RI LFHFAKECPPGRPDVLVWVSMLNTAPLPVKS I VLQAAVPKS 
MKVKLQP PSGTELS PFSPIQP PAAI TQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6800 


404 


1646


RRSPSTGLSPVPQPSSPSLSDYSIPWSLLLSGTIAWATPGK*AG
*PQAW*LGLAPAIAFI/GLTRGRKQNKEKMAEGGSGDVDDAGDC 
S G AR YNDW SDDDDDS NES KS I VW YP PW AR I GTEAGTRARARARA 
RATRARRAVQKRAS PNSDDTVLS PQELQKVLCLVEMSEKPYI LE 
AALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKAL 
IVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNE YQHMLANS ISDF FRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN 
FKWEENEPTQNQFGEGSLFF FLKE FQ VCADKVLG IBS HHDFL VK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQOASVTMriDVDAESFEVLVDVCyTGRVSLSEANVERL 
YAASDMLQLEYVREACASFLARRLDLTNCTAI LKFADAFGHRKL 
RS QAQS Y I AQN FKQL SHMGS I REETLADLTLAQLLAVLRLDS LD 
VESEQTVCHVAVQWXEAAPKERGPSAABWKCVRWMHFTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R*QQQLSCICSRKSTPETGYVCQGDGDLLWTPQRSLS\RYDPY 
S GD I YTM P S P LTS FAHTKTVTS SAVCVS PDHD I YLAAQPRKDLW 
VYKPAQNS WQQLADRLLCREGMDVAYLNG YI Y I LGGRDP I TGVK 
L KE VEC Y S VQRNQWALVAPVPHS FYS FEL I WQNYLYAVNS KRM 
LOTDPSHNMWLNCASLKRSDFQEACVFNDEIYCICDIPVMKVYN 
P ARG EWRR I SN I PLD S ETHNYQ I VNHDQKLLL I TSTT PQW KKNR 
VTVYEYDTREDQWIN IGTMLGLLQFDSGFI CLCARVYPSCLEPG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802 


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 
PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE 
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREBKEKERLNEELHELKEENKLLKGKNTLANKEKEHYEC 
E I KRLNKALQDALN I KCS FSEDCLRKSRVBFCHEEMRTEMEVLK 
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRER 
G 


6803


1


2203


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLALDN
KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS 
S I LAALRKMQDGYFGGARVQTGKLSEFLTTS CCTHLS FMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKLAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASRPS FNLLDS PHPRQENQVPSVRVE IHLPRD 
Q SGE VD FKAL VLQLKETSSLQEQAD I LYM LYTMKG P DWNTEL YN 
ERSATVRELLTELYGKVGEIRHWGLIRYISGILRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKTISAPIjPYEALTQLIDEASEGDM 
S I S I LTQE I MVYLAMYMRTQPGLFAEMFRLR IGLI I QVMATELA 
HSLRCSAEEATEGLMNLSPSAMKNLLHH I LSGKEFGVERK/SVR 
PTD S NVS P AI S IHE I G AVGATKTERTG I MQL KS E I KQ VE FRRLS 
I SAESQS PGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVPVGF YQKVWKVLQKCHGLSVEGF VLP S STTREMTPGE I K 
FS VHVE S \VLNVLLRPE YRQLLVEAI LVLTMLAD IE IHS IGS I I 
AVEKIVHI ANDLFLQEQKTLGP \DDTMLAKDPASG\ I CTLR\YD 
SAPSGRFGTMTYLS \RAA\ ATYVQE FLP \HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEECRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQT VAE E E S CS P S VE LEKPPP VNVD S KP I EE KT VEVNDRKAEF P 
SSGSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN 
QEEVRSIKSETDSTIEVDSVAGELQDLQSERE*LASRF*CQCEL 
KQ**SARTRTS*KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK 
QQKEGKRHK 


6805 


1539 


206 


RQPDLKYFGKSFDVSVSESSSLLSNDLPKFADGIKARNRNQNYL 
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
IEVHTAEDVPIAVEVHAISEDYDIETENNSSESLQDQTDEEPPA 
KLCKILDKSQAlrNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI 
CKYCDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQFSSSSELY 
LHFQEHSCDEQYLCQFCEHETNDPEDLHSHWNEHACfCLrELSD 
KYNNGEHGQYSLLSKITFDKCKNFFVCQVCGFRSRLHTNVNRHV 
AI EHTKI FPHVCDDCGKGFSSMLE \ I AKHLNSHliS EG I YLCQYW 
E YS TGQI EDL KIHLD F KHS ADL PHKCS DCLMRFGNERE L I SHIjP 
VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCLiLAELGPVPIEVPLTRKDAGSQQV 
GFLLGS CG VFIiAL TTDACQKGL P KAQTGEVAAFKG WPPL S WL VI 
DGKHLAKP PKDWH PLAQDTGTGTAY IEYKTS KEGS TVG VTVSHA 
S LLAQCRALTQACGYS EAETLTNVLDFKREAGLWHGVLTS VMNR 
MHVVSVPYALMKANPLSWIQKVCFYKARAALVKSRDMHWSLLAQ 
RGQRDVS LS S LRMLI VADG ANP WS I S S CDAFLNVFQSRG LRP E V 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLSVLTVQDVGQVMPGANVCVVKLEGTPYLCKTDEVGEICV 
SS S ATGT A YYGLLG IT KNVFE AVP VTTGGAP I FDR P FTRTGLLG 
FIGPDHLVF I VG KLDGLM VTGVRRHN ADD WATALAVE PMKFVY 
RG R I AVFS VTVLHDDR I VLVAEQRP DAS EEDS FQWMS RVLQ AI D 
SIHQVGVYCLALVPANTLPKAPLGGIHISETKQRFLEGTLHPCN 
VLM C PHTCVTNL P KPRQKQ PE VG PASM I VGNLVAG KR I AQASGR 
EliAHLEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
S TATCVQLH KRAER VAAALMEKGRL S VG DHVALVY P PGVDL I AA 
FYGCLYCG^VPVTVRPPHPQNLGTTLPTVKMIVEVSKSACVLTT 
QAVTRLLRS KEAAAA VD I R T WPT I LDTDDI P KKK I AS VFR PP S P 
DVLAYLDFS VS TTG I LAG V KMS HAATS ALCRS I KLQCEL Y P SRQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVtiVPPLELESNVSLWLS 
AVS Q Y KAR VTF CC Y S VME MCT KGLGAQTG VLRM KG VNLS CVRTC 
MWAEERP\RIALTQSFS KLFKDLGLPARAVSTTFGCRVNVAI C 
LQGTAGPDPTTVYVDMRALRHDRVRIjVERGS phslplmesgki l 
PGVKVI I AHTETKGPLGDSHLGE I WVS S PHNATGY YTV YGEEAL 
HADHFSARLSFGDTQTIWARTGYLGFLRRTELTDASGGRHDALY 
WGSLDETLELRGMRYHPIDIETSVIRAHRS IAECAVFTWTNLL 
WWELDGLEQDALDLVALVTNWLEEHYLWGVWI VDPGVI P 
I NS RGEKQRMHLRDGFLADQLDP I YVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSFPVNVTVSPRSEESHTTTVSGGNG 
SVFQAGPQLQAIiANLEARRGSIGAALSSRDVSGLPVYAQSGEPR 
RLTQ AQ VAAF PGENALEHS SDQDTWDSLRS PG FCS PLS S GGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 

rfpcgmevhsgqreleswavgeama\lkfpmgamsyclrdrsr 

FLFRLPMGLSCPLQVQ 


6808 


2063 


737 


GVGSGAASALARSRPLASRLSSRRRTRAPRSGAMQRLAMDLRML
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\
ITSKPPVQYRNELIKTADGGQISLDWFDNPNSTCYNDASTRPTI
LLLPGLTGTSKESYILHMIHLSEELGYRCWFNNRGVAGENLLT
PRTYCCANTEDLETVIHHVHSLYPSAPFLAAGVSMGGMLLLNYL
GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFNYYLTTC
LQSSVNKHRHMFVKQVDMDHVMKAKSIREFDKRFTSVMFGYQTI
DDYYTDASPSPRLKSVGIPVLCLNSVDDVFSPSHAIPIETAKQN
PNVALVLTSYGGHIGFLEGIWPRQSTYMDRVFKQFVQAMVEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF
GQFGKILDVEIIFNERGSKGFGFVTFETSSDADRAREKLNGTIV
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP 


6810


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF
GQFGKILDVEIIFNERGSKGFGFVTFETSSDADRAREKLNGTIV
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


dlvtvwsfvdcrviasthghVkswvswafdpyttsveegdpme 
fsgs dedfqdllhfgrdrads tqcrls rrns tdsrp vsvtyrfg 
svgqdtqlclwdltedilfphqplsrarthtnvmnatsppagsn 
gnsvttpgnsvppplprsnslphsavsnagskssvmdgaiasgv 

SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDP AKTLGT PLC PRMEDV PLLEP L I C KKI AH E RLTVL I FLEDC I 
VTACQEGFICTWGRPGKWS FNP 


6812 


4001 


l6&2 


EDAVFSLDLSTIIQGTWFLNGEELKSNEPEGQVEPGALRYRIEQ 
KGLQHRL I LHAVKHQDSGAL VG FS C PG VQDSAALT I QES P VH I L 
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPH11RLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIWPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
EWES PALLLQKEDTVRRLVLPAVQLEDSGEYLCE IDDESAS FT 
VTVTE P PVRI I Y PRD EVTL IAVTLEC WLMCELSREDAPVRW YK 
DGLEVEESEALVLERDGPRCRLVLPAAQPSDGGEFVCDAGDDSA 
FFTVTVTEPPVQ FLALETTPS PLCVAPGE P WLS CELS RAGAP V 
VWSHNGRPVQEGEGLELHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPSLS FTVQVAEP PVRWAPEAAQTRVRSTPGGDLELWH 
LSGPGGPVRWYmSERIASQGRVQLEQAGARQVLRVQGARSGDA 
GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVS P P DADVT W LRNGAWT PG PQRQS CCS YGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHIi 
LHSRQGSQ IDQTECVI RMNDAPTRGYGRDVGNRTSLRVI AHSS I 
QRILRNRHDLLNVSQGTVFI FWGPSS YMRRDGKGQVYNNLHLLS 
QVLPRLKAFMITRHKMLQFDELFKQETGQ\NRKISNTWLSTGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQ EAN / ARERNRMHGLND ALDNLRKVVPC Y S KTQ KLS K 1 KT 
LRLAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARSFLMGQGGEAAHHTRS P YSTFYPPYHS PELTTPPGHG 
TLDNSKSMKP YNYCSAYES FYESTS PECASPQFEGPLS P P P INY 
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD 
SHFPYDLHLRSQSLTMQDELNAVFHN 


6815 


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLLKTHKF\LLGQDEDSLHSVPVAQMGNYQEYLKTLASPLREID 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRRSMSLLLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LI KPTLVHTDAT I I HDGHEEKMENGQ I TPDGFLSKS AP SELINM 
TGDLMPPNQVDSLSDDFTSLSKDGLIQKPGSNAFVGGAKNCSLS 
VDDQ KD P VASTLGAMPNTLQ I TP AMAQGINAD I KHQLMKE VRKF 
GRSK 


6817 


172 


3457 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQESIFLCEDLQCIYPLGSKSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSLLLANSKKTRNYIA 
IDGGKVLNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLERNEI 
LEADTVDMATTKDPATVD VS GTGR PS PQNEGCTS KIiEMPLES KC 
TS FPQALCVQWKNAYALCWLDCILS ALVHSEELKNTVTGLCS KE 
ESIFWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEIET 
CLNEVRDEIFISLQPQLRCTLGDMESPVFAFPLLLKLETHIEKL 
FLYSFSWDFECSQCGHQYQNRHMKSLVTFTNVIPEWHPLNAAHF 
GPCNNC^SKSQIRKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHF 
EGCL YQ I TSVI Q YRANNHF I TWILDADGS WLECDDLKGPCS ERH 
KKFEVP AS EIHIVIWERK1S Q VTD KE AACLP LKKTNDQKALSNE 
KPVSLTSCSVGDAASAETASVTHPKDISVAPRTLSQDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 
AENTGILKTNTLLSQESLMAbbVSAF^NK.Rjjiyijyr vuior ray 
VVNTNMQS VQLNTEDTVNTKS VNNTDATGLICX?VKS VE I E KDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKICKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKG1.ISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPISKPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGQIHKLRLKliRKKLKAEKKKIiAALMSSPQSRTVRSENIjE 
QVPQDGS PNDCES IEDLLNELP YPIDIANESACTTVPGVSLYSS 
QTHEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
E FNE VSQNTHLRQDHN YCS P TKKNP CE VQ PDS LTNNACVRTLNL 
ES PMKTD I FDEFFSS SALNALANDTLDLPHFDE YLFENY 


6818 


2 


240 


rgTdKVIjWT/LSGAVK\CVQFSRISPDGEEGYPGEIjKVWVTYTL 
DGGE /LHS /ATTEHKP / VQATP VNLT \T I LTSTWQARLPQ I 


6819 


1 


961 


G I P CTEMGN FDNANVTG E I E F A I H YC FKTHS LE I C I KACKNLAY 
GEEKKKKOTPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVSVWHLGTLARRVFLGEVIIPLATWDFEDS 
TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSLHGQLCLWLGAKNLPVRPDGTLNSFVKGCLTLP 
DQQKLRLKS PVLRKQACPQWKHS FVFSGVTPAQLRQSSLELTVW 
DQALFGMNDRLLGGT \ RLGSKGDTAVGGDACSQSKLQWQKVLS S 
PNLWTDMTLVLH 


6820


1014 


340 


GDM VY I VGHVP PG F FEKTQNKAW FREGFNEKYLKWRKHHRVI A 
GQ FFGHHHTDS FRMLYDDAGVP I S AMF I TPGVTPWKTTLPGWN 
GANN P A I R VFE YDRATLS LiKDMVTYFMNLS QANAQGT PR WELE Y 
QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYNSVSYSAG 
VCDE ACSMQHVCAMRQVD I DAYTTCLYAS GTTPVPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 
GSVIEVLQRRQEGLAS 


6823 


654 


221 


PPKLLSRWARMGHGDEIV\LSDLNFPGLLHLPWGPWRSVQTAC 
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YESILRRAGCVRALAKIERFEFYERAKKAFAVVATGETALYGNL 
ILRKGVLALNPLL 


6824 


858 


104 


lllaqrwgwgVccffsiavsvkmnvllfapgllflij.tqfgfrg 

ALPKLGICAGLQVVLGLPFLLENPSGYLSRSFDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGES ILS 
LLRDPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
W Y FHTL P YLLWAM P ARWLTHLLRLL VLG LIELS WNT Y P STS CSS 
AALHICHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSLMEPWALGACTFVHLL 
PKFDPLVILKTLSSYPIKSMMGAPIVYRMLLQQDLSSYKFPHLQ 
NCLAGGESLLPETLENWRAQTGLDl RE F YGQTETGLTCMVS KTM 
KI KPG YMGTAAS CYDVQ 1 1 DDKGNVL P PGTEGD I G I R VKP I R P I 
GIFSGYVDNPDKTAANIRGDFWLLGDRG I KDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHDPEQLTKELQQHVKS VTAP YKYPRKI EFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 


6826 


2304 


954


LKTESFKPW/VNIALAFKLLGERASPNSFWQPYIQTLPREYDTP
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDSFTYEDYRWAVSSVMTRQNQIPTEDGSRVTLALIPLW 
DMCNHTNGLITTGYNLEDDRCECVALQDFRAGEQIYIFYGTRSN 
AEFVIHSGFFFDNNSHDRVKIK3 J GVSKSDRLYAMKAEVLARAGI 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RIFTLGNSEFPVSWDNEVKLWTFLEDRASLLLKTYKTTIEEDKS 
VLKNHDLS VRAKMAI KLRLGEKEILEKAVKS AAVNRE YYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLYLRNLEEEAGVQDALNI 
REAISKAKATENGLVKGENS IPNGTRSENESLNQBSKRAVEDAK 
GSSSDSTAGVKE 



WO 01/53312 



PCT/US00/34263 





6827 


1 


779 


SSVVEFGLSVLGGLFLljFVLENMIiGLLRHRGIjRPRCCRRKRRNL 
E TRNLD P ENG S GMALQ P LQAAPE PG AQGQRE KNSQH P P ALAP PG 
HQGHSHGHQGGTDITWMVLLGDGLHNLTDGliAIGAAFSDGFSSG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPS S 
GAP AYA\ HVLLQG LGLLLGG CLMLAI TLLE ERLLP VTT EG 


6828 


3 


1654 


KSQHG/WILQLPffiSCKEGYVKDLKGNPGLHRAMLDLDNG I RPSE 
LGHLSQTASLKRGSS FQSGRDDTWRYKTPHRVAFVEKLTKLVLS 
QLPNFWKLWI S YVNGSLFSETAEKSGQIERS KNVRQRQNDFKKM 
IQEVMHSLVKLTRGALLPLSIRDGEAKQYGGWEVKCELSGQWLA 
HAIQTVRLTHESLTALEIPNDLLQTIQDLILDLRVRCVMATLQH 
TAEE I KRLAE KEDW I VDNEGLT S LP CQFEQC I VCSLQS LKGVLE 
CKPGEASVFQQPKTQEEVCQLS INIMQVFIYCLEQLSTKPDADI 
DTTHLS VDVS S PDLFGS IHEDFSLTSEQRLLIVLSNCCYLERHT 
FLN I AEHFE KHNFQG I E KI TQVS MAS LKELDQRLFENY I E LKAD 
P I VGSLE PG I YAGYFDWKDCLP PTGVRNYLKEALVNI IAVHAE V 
FT I S KELVPRVLS KV I E AVSEE LS RLMQCVS S FS KNGALQARLE 
ICALRDTVAVYLTPESKSSFKQALEALPQLSSGADKKLLEELLN 
KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR 
EQKS FLSRLCQGEELQSDRDETGAYLlLJKDJf 1 x cijr 1 i-iw r likihj 
KLVLDKDMAEEG VLB EAE FYN I GPL I R I I KDRM EEKD YTVTQVP 
PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QAEFLCWSKEXjHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 

eeveveqvqveadaqek/ccykpeapgceapdhlqglgvpi 


6830


1 


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETLTLQKQL
RYRFPELADPDTCYGFRFCHQLDFSTSGALCVALNKAAAGSAYR
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI
EGSQGCENPKPSLTDLWLEHGLYAGDPVSKVLLKPLTGRTHQL
RV\HCS Al^HPVVGDLTYGEVSGREDRFrKMMiji^ ijjK. J-f i u i 
ECVEVCTPDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 
DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCLQWLSEWT 
LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKFISEVSREDYGKKEISGDSEEMNINSWTSADGENL 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFPWDKVPQQPKSASSNFASKNITKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRSVS PTEKKI)1^EINR\S ytl\aekkvlaekqnsv\aplelrds 

_- T _, T _, v .„_ TrT . T ppTiPTUT vT?ciranaMDnHFVn"MFr)YNERPKIIVG 

NEIGKTQI IJbvsbKo lttijJULbl^>UJ>irnJrw nr lywCii'iPiOKr^ j. w vj 

SEKEKDEKKKK 


6832


1809 


412


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV
VSLKKKRSEDDYEPIITYQFPKRENLIiRGQQEEEERLLKAIPLF 
CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDEVEKRHQISMAVIYPFMQGL 
REAAFPAPGKTVTLKSFI PDSGTEFI SLTRPLDSHLEHVDFSSL 
LHCLS FEQILQI FASAVLERKI IFLAEGLSTLSQCIHAAAALLY 
PFSWAHTYIPVVPESLLATVCCPTPFMVGVQMRFQQEVMDSPME 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKILE 
YEEQKKQ/TETKGKNCEIRAWNKND 


6833 


1 


1129 


P LMT LS QCGG I P GHGHSHGGHGHGHGLP KGPRVKSTRPGSSDIN 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
Q VNGN L VRE PDHMELEEDRAGQ LNM RGVF LHVLGDALG S V I VW 
NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVEIINSTHASVYEA 
GPCV^LYLDPTLCVVMVCILLYTTYPLLKBSALILLQTVPKQID 
IRNLIKELRNVEGVEEVHELHVWQLAGSRIIATAHIKCEDPTSY 
MEVAKTIKDVFHNHGIHATTIQPEFASVGSKSSWPCELACRTQ 
CALKQCCG T L PQ AP S GKD AE KT P AVS I S CLE LSNNLB KKPRRTK 
AENIPA\WIEIKN\ IPNK\QPESSL 


6834 


78 


1151 


AGQERPAP I WRLLWLPTPS VS RKAEPAH I P INR* GA* E * KUGLP 
LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP 
CS KLVLSGARG I VGTTVQ VLVEAQQP LLLLFTGVWGLNLRAGEE 
SRAL * LIE E VTQ VRD AHLGNAWG CAQCLS QGQ VGS ALAKALLE 
AAAAVRDCKEVLTVSGDKQQAEVSVRL*VRDVCVEEAGCVEFGQ 
AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 
LQQWGDAL*ARE*APQIIVLLLLEDVAQLRTGKKA*DLWDVE 

QLLROL 


6835 


1 


834 


GIPAADR\EASLELIKLDISRTFPNLCIFQQGGPYHDMLHSILG 
AYTCYRPD VGYVQGMS F IAAVLI LNLDTADAF I AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFEVFFEENLPKLFAHFKKNNLTPDIYL 
IDWIFTLYSKSLPLDLACRIWDVFCRDGEEFLFRTALGILXLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPFDVDGMITLKV^DNLTYRTSPDSLRRVFblK^GkVGD^ 
YIPREP.HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRS RG P S CS RS RS RS R YRGS R YS RS P YSRS P YSRS RYS RS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 
KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKASlii' 
PN P P AQGDGTS LS PNYTLES TS GNDG KPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYS AAPDSGGAPGVS PGQQQASGAAVGGS SAGET 
RGAPT P HEKALTS PS WGKGAELLLGDQ PDL I G S LDGGAKS DS S S 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP 
KL P PRGVGAGE HG PKAP P PALGLG I MSNS TS TP DS YGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDSELGSCCSEAVKSAMSTI 
DLDSLMAEKSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


" LTDT P PPKTHM I HHS I S DYKATLRCWALGFY PME ITLTWQQDE K 
DQTRDMELVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
GLPEPLTLRWEQSSQPTIPIVGIVAGLVLLGAWTGAWSAVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


" 6839 


1 


1195 


- ..^BPrrDnDPaT GiiTrprpuT.qnT.QMPQVKRLDALLSEPIPIHG 

AAPAtjWjFIJFl2iAL»oAr IrVjlvriJuSVjJUow v iu^uw/ujuu ut *• 

RGNFPTLS VQPRQ IRAGGPQHPGGAG \ IHVHRVRLHGSAASHVL 
HPES GLG YKDLD L VFRMDLR S EAS FQLTKAWLACLLDFL PAG V 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFSIDSFQIILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVIATRSPEEIRGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARRYACLVTLHRWNESTVCLMNHERRQTLDL IAALALQALAE 
QGPAATAAIAWRPPGTDGWPATVNYYVTPVQPLLAHAYPTWLP 

CN 


6840 


4254 


2061 


" ' ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLIRVDGKGSIlusi, 
FPTGKQLEPLVAPLADGKVAVGQDDLTVVLNEEGICTQKCALNW 








TDIPVAMEHQPPYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFI 
TSGGSN 1 1 YVASNHFVWRL I PVPMATQ I QQLLQDKQ FELALQLA 
EMKDDSDSEKQQQIHHIKNLYAFNLFCQKRFDESMQVFAKLGTD 
PTHVMGLYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHLALIDY 
LTQKRS QL VKKLNDS DHQSSTS PLMEGT PT I KS KKKhhQ I IDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKALQVLVDQSKKANS PLKGHERTVQ YLQHLGTENL 
HLI FS YSVWVLRDFPEDGLKI FTEDLPEVES LPRDRVLGFLIEN 
FKGIiAIPYLEHIIHVWEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
P FDGLLEERAL LLGRMG KHEQ ALF I Y VH I LKDTRMAE E YCHKH Y 
DRNKDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPKANLQAA 
LQ VLE LHHS KLDTT KAIiNLLP ANTQ IND I R I FLE KVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERILHQQVKCIITEEKVCMVCKKK 
IGNSAFARYPNGVWHYFCS\KEVNPADT 


6841 


1 


3206 


TPSTTGTKSNTPTSSVPSAAVTPLNESLQPLGDYGVGSKNSKRA 
RE KRD S RNM EVQ VTQBMRNVS I GMGS SDEWS D VQD 1 1 DS T PELD 
N5CPETRLDRTGS S PTQG I VNKAFG INTDSLYHELSTAGSE VIGD 
VDEGADLLGEFSGMGKEVGNLLLENSQLLETKNALNWKNDLIA 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 
1 1 ARREP KEE AE D VS S YLCTES D K I P MAQRRRFT RVEMAR VLME 
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLS PNGGQEDTRMKNVPVPVYCRPLVEKDPTMKLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKVVIIDANQPGTVVD 
QFTVCNAHVLCISSIPAASDSDYPPGEMFLDSDVNPEDPGADGV 
LAGITLVGCATRCNVPRSNCSSRGDTPVLDKGGGEVATIANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWLYVHS AVANWKKCLHS I KLKDS VLSLVHVKGRVLVALAD 
GTLAIFHRGEIX^WDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKSFDAHPRRESQVRQLAWIGDGVWVSIR 
LDS TLRL YHAHTHQHLQD VD I E P YVS KMLGTGKLG FS FVR I TAL 
L VAG SRL WVGTGNG WI S I PLTETWLHRGQ \ LLG \ LRANKTS P 
TSGEG\ARPGG\ I IHVYG\DDSSDRAARSFIP YCSMAQAQLCFH 
GHRDAVKFFVSVPGNVLATliNGSVLDSPAEGPGPAAPASEVEGQ 
KLRNVLVLSGGEGY I DFRI GDGEDDETEEGAGDMSQVKPVLS KA 
ERSHI I VWQVSYTPE 


6842 


3 


926 


RCQQLS AT I LTDHQ YLERTPLCAIL KQ KAPQQ YR I RAKLRS YKP 
RRLFQSVKLHCPKCKLLQEVPHEGDLDI I FQDGAT KT PDVKLQN 
TSLYDSKIWTTKNQKGRKVAVHFVKNNGILPLSNECLLLIEGGT 
LSEICKLSNKFNSVIPVRSGHEDLELLDLSAPFLIQGTVHHYGC 
KQWST * RS IQNLNSLVDKTSW I PSS VAEALGI VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQIPASEVLMDDDLQKSVDMIMDMFC 
PPGIKIDAYPWLECFIKSYNVTNGTDNQICYQIFDTTVAEDVI 


6843 


2 


851 


NHRKVLSGAKRYECNECGKS FAYTSS LI KHRRIHTGERPYECS E 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSLIIHLRVHTGERPYECSDCGKSFAEKSSLIKHLRVHTGE 
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLLQGQRVHTGSRCYECDKWG I FFS *NAS F FT* KS APTEEVPFE 
CNECEKAFS PLSLVTTI FT 


6844 


244 


642 


" EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSEIYSFGIVLWEIATGDIPFQGCNSEKIRKLVAVKRQQE 








PLGEDCPS ELREI I DECRAHDPS VRPS VDE ILKKLS T FS K* C I K 
I 


6845 


3 


1519 


VAVRDECYWRHVFWDQDLWMLLF I LMCHPETARARLE YR I RTLD 
G ALENAQNLG YQGAKFAWE S AD SGLE VC P ED I YG VQE VHVNGAV 
GLAFBLYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE 
KYHliRGVMS PDEYHSG VNNS VYTNVLVQNSLRFAAALAQDLGLP 
IPS QWLAVADKI KVP F DVEQNFH P E FDG YEPGE WKQ AD WLLG 
YPVPFSLSPDVRRKNLEIYEAVTSPQGPANTWSMFAVGWMELKD 
AVRARGLLDRSFANMAEPFKWfTENADGSGAVNFLTGMGGFLQA 
W FGCTG FR VTRAGVT FD P VCL SG I S RVS VSG I FYQGNKLNFS F 
S EDS VTVEVTARAGP WAPHLEAELWP SQS RLS LLPGHKVS FPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQSPLWVTLGSSSP 
TESLTVDPASE^SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEESARG 
1 1 FTKTRQ S AYALSQ W I TENE KF AE VGVKAHHL I GAGHS S E FKP 
MTQNEQKEVISKFRTGKINLLIATTVAEEGLDIKECNIVIRYGL 
VTNE I AMVQARGRARADESTYVLVAHSGSGVI EHETVNDFREKM 
MYKAI HCVQNMKPEE YAHKI LELQMQS IMEKKMKTKRNIAKHYK 
NNPSLITFLCKNCSVLACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTIjQKKCADYQINGE I ICKCGQAWGTMMVHKGLDLPCLKIR 
NFWVFKNNSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450 


348 


smcwnsdrlemplidialilyppsyvpytghlsddslsrkyclt 
wfedalngvl * raea i qphcvnagdrmekfrqkywnklqtlrqq 
p faygtltvr s lldtreh clne fn fpdp ys kvkqrengvalrcf 
pgwrsldalgweerqlalvkgiilagnvfdwgakavsavlesdp 
yfgfeeakrklqerpwlvdsysewlqrlkgpphkcalifadnsg 
i d 1 1 lgvf p fvrelllrgte vilacns gpaiind vths es l i vae 
riagmdpwhs alreerlllvqtgss s pcldls rldkglaalvr 

ERGADLWIEGMGRAVHTNYHAALRCESLKLAVIKNAWIiAERLG 
GRLFSVIFKYEVPAE 


6848 


19 


16 


AMWWN S LDG I RN I VLSN P KKRNTLS LAMLKS LQ SD I LHDADS ND 
LKVI I ISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
I RNHP VP V I AMVNGLATAAG CQLVAS CDIAVASDKS SFATPGVN 
VGLFCS TPG VALARAVPRKVALEMLFTGEP I SAQEALLHGLLNK 
WPEAELQEETMR I ARKI AS LSRP WS LGKATF YKQLPQDLGTA 
YYLTSQAMVDNLAIiRDGQEGI TAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SLGVDGSCLEQGS PAPRPQTDTSP* PVGNWATQOEDLYHQS YEC 
VCVLFASVPDFKEFYSESNINHEGLE CLRLLNE I IADFDELLS K 
PKFS GVE KI KT I GS TYMAATGLNATS GQDAQQDAERS CS HLGTM 
VEFAVALGSKLDVI NKHS FNNFRLRVGLNHGP WAG VIGAQ KPQ 
YD I WGNTVNVAS RM E STG VLGKIQ VTEETAWALQS LGYTC Y S RG 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


" ARG LNHE WT FEKLRQH I S RN AQ D KQ E LH L FMLS G V P DAV FDLTD 
LDVL RLE LIPEAKI P AK I S QMTNLQ E LHLCHC P AKVEQTAFS FL 
RDHLRCLHVKFTDVAE I PAW VYLLKNLREL YL I GNLNSENNKM I 
GLES LRELRHLKI LHVKSNLTKVPSN I TDVAPHLTKLVIHNDGT 
KLLVLNS LKKMMNVAELELQNCELER I PHAI FS LSNIjQELDLKS 
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
YFSNNKLESLPVAVFSLQKLRCLDVSYNNISMIPIEIGLLQNLQ 
HLHITGNKVDILPKQLFKCIKLRTLNLGQNCITSLPEKVGQLSQ 
LTQLELKGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK 
EALNQDINIPFANGI 


6851 


1765 


660 


VSAQVSAREGENCLGWN1ADSSQESYKSLEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 


kkPVfrEPGDWfEPGAPGGTEDRPSGGGKAi^RAHPIKQRPFF^ 
^TWSAQNCKNGSCVLDLSKCLFIQGKLLFAEPKDAGFPFSQD 
4KNTWSAyiN^xuMv J ov.v napnNLNASIESQGQIKM 
INSHLASLSMARNTSPTPDPTVREALCAPDNLNAbi^^ux 

F p LI SEGSGCAKVQ VLKPLMGL SE KP VLAGEL.VG AQMLF S FMS h 
P I RNGNRE I LLETP AP -_ - 


6852 




407 




6853 


3 


469 


'355 CAVC IELY KPNLUj VRllJCNH I FHKTCVDt'WLljKHRTCPMC 
KCDILKRLGIEVDVEDGSVSLQVPVSNEIFNSASSHBBDNHSET 
S^T^PPLEEHVQSTNESKJLWHEANSVAVDVIPH 

vniCPTFEEDETPNQETAVREIKS . 


6854 


1148 


585 


SScEEFWHTIRyPNWKHISCKHAESVETEGNGEDLR 

SeeSleahgdyglrndyhmnlgqfleflkkhksehvfqi 

-^iitoCSPLSGANE.lAbUUli.KlKEVIiFTUOTUULkK 


6855 


1913 


1148 


Iptslfgrdsetkgesglvuegdkeihqifedldkkialrsrf 
^ipegcS^mvvaLdalhregivcrd^pnnillndrghi 

Q^™evedscdsda I= ^ 

GAVLFELLTGKTLVECHPAGINTHTTLNMPEWVSEEARblilUU^ 


6856 


1617 


997


wrltlvkawnvdemayaqlvslgnpdfie^gvtycgesea 


6857 


1 


617 


^MAmMKTLPSATEDAKEEGLEAQISMAELIGRLESKALWF 

^SStnmLqlvrqemavcpeqlsefldslrqylrgt 

TCVRNCraiTAVRLSDGFTFVIYEFWETEEAWKRHLQSPLCKAF 
" ^SRGIKDFENDPPLSbCOiFQSRIAGDALmSUXRISSVFASPA 


6858 


2 


669 


lrctqtaSileelklekkikirvepgifewtkmeagkttptlm 

rrvj 

, . ^-.y, ^-rTo e-r-i^VTJT.QriT TOQPSSTGliLKSG 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSUibl^ J r. LSbl TR 

KTNSVESLPEU.TSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 

SQAPKIVRCSTHaTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAP^F^IVBEE«3QEAALIRSREKPUVLIQIEEHAIQDLLVF 

YEAFGNPEEFVIVERTPQGPLAVPMWWKHGC 

^gROKKRGXFPKVAl'WTMRMiLFOHLTHPYPSEEQlCKU^ 


6860 


1889 


1515 


Slq^finarrijvqpmidqsnravsqgaayspegqp 

HGSFVLDGQQHMGIRfAGPMSGjjGjWI^MDGQW^M 


6861 


1889 


1515 


DTGLTILQVNNWFINARRI IVQPMIDQSNRAVSQGAAYSPBGQP 








MGS FVLDGQQHMG I RPAGPMSGMGMNMGMDGQWHYM 




2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA 
DEED PLGPNCYYDKTKS FFDN I S CDDNRERRPTWAEERRLNAET 
FG I P LRPNRGRGGY RGRGGLG FRGGRGRGGGRGGT FT AP RG FRG 
G FRGGRGGRE FAD FE YRKTTAFG P 


6863


2216 


487 


P QE PAL KS E FSQ VASNT I PLP L PQ PNTCKDNG P CKQVCS TVGG S 
AICSCFPGYAIMADGVSCEDQDECLMGAHDCSRRQFCVNTLGSF 
YCVNHTVLCADGYILNAHRKCVDINECOTDLHTCSRGEHCVNTL 
G S FHCYKALTCE PG YALKDGE CEDVDECAMGTHTOQPG FLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSLSEPCRPGFSCI 
NTVGSYTCQRNPLICARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GS YQCYCRQG YQIiAEDGHTCTD I DE CAQGAG I LCTFRCLNVPGS 
YQCACPEOGYTMTANGRSCKDVDECAK5THNCSEAETCHNIQGS 
FR CLRFE CPPNYVQ VS KTKCERTTCHD FLE CQNS PAR I THYQLN 
FQTGLLVPAHI FRIGPAPAFTGDT1ALNI IKGNEEGYFGTRRLN 
AYTG VVYLQRA VLE PRDFALDVEMKL WRQG S VTTFLAKMH I F FT 
TFAL 


6864 


2 


2933 


ladsspsnlqiiikeLlsmhhqpdpaltkefdylppvdsrsssg 
fvg lrnggatc ymn avfqql ymqpg l pe sll s vdddtdnpdd s v 
fyqvqslfghlmesklqyyvpenfwkifkmwnkelyvreqqday 
e f fts lidqmde ylkkmgrdq i fknt fqg i y s dq k i ckdc phry 
ereeafmalnlgvtscqsle i sldqfvrgevlegsnayycekck 
ekri tvkrtc i kslpsvlvi hlmrfgfd wesgrs i kydeqirfp 
wmlnmepytvsgmarqdsssevgengrsvdqggggsprkkvalt 
enyelvgvivhsgqahaghyysfikdrrgcgkgkwykfndtvie 
efdlndetleyecfggeyrpkvydqtnpytdvrrrywnaymlfy 
qrvsdqnsp vlp kksrvs wrqeaedlslsapss pe i spqs spr 
phrpnndrlsiltklvkkgekkglfvekmpariyqmvrdenlkf 
mknrd vyssd yfs f vlslas lnatkl khp y yp cmakvs lq l*a iq 

FLFQTYLRTKKKLRVDTEEW I ATI EAIjLS KS FDACQWLVE YF I S 
SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKLKSL 
HQLLEVLLALLD KDVPENCKNCAQYFFLFNTFVQKQG I RAGDLL 
LRHS ALRHM I S F LLGASRQNNQ I RRWS S AQAREFGNLHNTVALL 
VLHSDVS SQRNVAPGI FKQRP P I S IAPS SPLLPLHEEVEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
F I KNQLETAP PHELKNTFQLLHE I LVI EDP IQVERVKFVFETEN 
GLIiALMHHSbraVDSSRCYQCVKFLVTLAQKCPAAXEYFKENSHH 
WSWAVQWLQKKMSEHYWTIiQSNVSNETSTGKTFQRTISAQDTLA 
YATALLNEKEQSGSSNGSESSPANENGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAADYNQALGTCRLAGTALCVAAGVLIiAI CLFWAM IGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQLSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


" D CPRPRYT LYGLRAT CMRDLD WAW I NAVS AF KALEQDLP VN I KF 
1 1 EGMEEAGSVALEELVEKEKDRFFSGVDYIVI SDNLWI SQRKP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
SLVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDLEEYRNSSR 
VEKFL FDT KE E I LMHLWR YP S LS I HG I EGAFD E PGTKT V I PGR V 
IGKFSIRLVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMVVSMTL 
G LH PW I AN I DDTQ YIiAAKRAI RTVFGTE PDM I R DGST I P I AKMF 
QE I VHKS WL I PLGAVDDGEHSQNEKINRWNY I EGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP 
LQSAESSPTAGKKLPEVPPSEEEEQEAWVNALLGRIFWDFLGEK 
YWSDLVSKK I QMKJUSKI KLP YFMNE LTLTELDMGVAVPKI LQAF 
KPYVDHQGLWIDLEMSYNGSFLMTLETKMNLTKIiGKEPLVEALK 
VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEFIKKKIEE y 
VSNTP LLLTVE VQECRGTLAVN I P P P PTDRVWYG FRKP PHVELK 
ARPKLGEREVTLVHVTDW I EKKLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


RPTRPPTRPEEIKNLILPY I S* DMN F V QDLCEDF YE .LF KTDKG FD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGRIVHLSNSFTQTVNCRKPFFSSW 


6869


3 


1619 


MYMERMDKRALISFWESVEHLlQiANKNEIPQLVOElYgwyrvisa 
KE I S VE KS L YKE I QQCLVGNKG I E VFYKI QED VYE TLKDR Y Y PS 
FIVSDLYEKLLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 
NQ I NEQAS FAVNKLRELN E KLE YKRQALNS IQNAP KPD KK I VSK 
LKDEIILIEKERTDLQLHMARTDWWCENLGMWKAS ITSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNLHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 

LCQS E AL YAFL S P S PD YL KV I DVQG KKNS FSLS S F LERLP RD FF 
SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMLIGEIFELRGM . 
FKWVRRTLIALVQVTFGRTINKQIRDTVSWIFSEQMLVYYINIF 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 
VGQQNARHGIIKIFNALQETRANKHLLYALMELLLIELCPEIiRV 

HT,nQT.KAGQV 


6870 


1 


15^6 


MAAVVAATRWWQLLLVL S AAGMGAS GAPQ P PN I LLLLMD DMGWG 
D LG V YG EPS RET PNLDRMAAE GLLF PNFYS AN PLCS P S RAALLT 
GRLP I RNGFYTTNAHARNAYTPQEI VGG I PDS EQLLPELLKKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWEMVGRYYEEFP INLKTGEANLTQI YLQEALDFI KRQ 
ARHHP FFLYWAVDATHAP VYAS KPFLGTSQRGR YGD AVRE I DDS 
IGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC 
GKQTTFEGGMREPAIiAWWPGHVTAGQVSHQLGS IMDLFTTSLAL 
AGLTP P SDRAI DGLNLLPTLLQGRLMDRP I FYYRGDTLMAATLG 
QHKAHFWTWTNSWENFRQGI DFCPGQNVSGVTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNW AVMNWAP PGCE KLG KCLT P PES I P KKCLWS H 


6871 


209 


1126 


""RMSLNPPIFLKRSEENSSKFVETKQSQ'ITSIASEDPIiQNLCLAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVP I EQI PLV 
KLPLKIDIIKHPNETDGKSTAIHAKLLAPEFVNIYTYPCIPEYE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQ VEL KTRKTC FWRHQKG KPDT F LST I EAI YY FLVD YHT 
D I LKE KYRGQ YDNLLF F YS FMYQLI KNAKCSGDKETGKLTH 


6872 


880 


459 


- 1 mi t jvnnn ,qr ■ T FMKGNCVREDLI FNFLFKLGLDVRETNGLFGJMT 
KKLI TE VF VRQKYLE YRRI P YTE PAE YE FLWGPRAFLETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECBWEDTDEDEPDTGDSAHG 

PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKXEDfaHUN 

iihteifgfsnnywelrrrrpklkklkkllmenpyegpdsqkek 

DSNSSKYTTEDLIiDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
DYEMKLLNHVTQLVDSESWSFGKVPLNTCLQEU3PLEPEEMIEH 
CLKCYGKKYAn^EGEVYFELDADKICRAAARMLLQNAVKFNIiAEF 

qewmsvpsgmvtsldqlkglalvdrhsrpeiifllkvddlpe 

DNQERFNSLFSLREKWTEED IAPY IQDLCGEKQTIGAIjLTKYSH 
S SMQNGVKVYNSRRP I S 


6874 


1 


307 


DS I ADHVNSAAVNVEEGTKNLGKAAKYKLAALPVAGALIOGMVU 
GPIGLIAGFKVAGIAAALGGGVLGFTGGKLTQRKKQKMMEKLTS 

SCPDLPSQTDKKCS 


6875


1688 


349 


VIGTGERGNSASEKWBIMFNEELGDPFIIIHSISLLNAEEHS1A 
TIiLL»R I E KB ELDMKGSGF YVSIiE WVT I S KKNQDN KKYE 1 1 KRD I 
LRGKSVPHYAAIEPDGNGLMIVSYKSLTFVQAGQDLEENMDEDI 
SEKI KEPL YYWQQTEDDLTVT IRLPEDNTKED IQ I QFLPDHIN I 
VLKDHQFLEGKL YS S I DHESSTW 1 1 KESNS LE I SLI KKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQE LEE CD IFFEESSS LCRFDGNTLKTTHWNLGSNQ YLFS V 
I VDPKEMP C FCLRHD VDAL.LWQ PH S S KQDDM WEH I AT FNALG YV 
QASKRDKKFFACAPNYSYAALCECLRRVF I YRQPAPMSTVLYNR 
KEGRQVGOVAKQQVASLETNDP I LGFQATNERLFVLTTKNLFLI 

tfVNTEN 


6876 


41 


1285 


VGEMTL I WRHLLRPLCL VT 5 APRI LEMHP F LS LGTS RTS VT KbS 
LHTKPRMPPCDFMPERYQVIFLVNSGSEANELAMLMARAHSNNI 
D 1 1 S FRGAYHGCS PYTLGLTNVG I YKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKS IAGFFAEP IQGVNGWQYPKGFLKEAFELVRARGGVC IAN 
EVQTGFGRLGSHFWGFQTHDVLPDIVTMAKGIGNGFPMAAVITT 
PE I AKS LAKCLQHFNTFGGNPMACAI GS AVLEV I KEENLQ ENSQ 
EVGT YMLLKFAKLRDEFE I VGDVRGKGLMIGI EMVQDKI S CRPL 
PREEVNQIHEDCKHMGLLVGRGS I FSQTFRI APSMC ITKPEVDF 
AVWFRRAT.TOHMERRAK 


6877 




778 


GTS PS PARAYAPPTERKRFYQN VS ITQGEGGFE INLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTIKYYTMHLTTLCNTSLDN 
PTQRNKDQL I RAAVKFLDTDTI CYRVE E PETLVE LQRNE WD P 1 1 
EWAEKRYGVE I SSSTS IMGPSI PAKTREVLVSHLAS YNTWALQG 
IEFVAAQLKSWVLTLGLIDLRLTVEQAVLLSRLEEEYQIQKWGN 
IEWAHD YELQELRARTAAGTLF IHLCSESTTVKHKLLKE 


6878 


1 331 


263 


QTL<^DFKNRAEMIDFNIRIKNVTRSDAGKYRCEVSAPSJi-U(JUW 
LEEDTVTLEVLVAPAVPSCEVPSSALSGTWELRCQDKEGNPAP 
E YTW FKDG I RLLENPRLG SQSTNS S Y TMNTKTGTLQ FNTVS KLD 
TGEYSCEARNSVGYRRCPGKRMQVDDLNISGI IAAWWALVIS 
VCGLGVC YAQRKGYFS KETS FQKSNS S S KATTMS ENDFKHTKSF 
TT 




3 


845


" IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDG'riVTVXVK 
KNSTTDQVYQAIAAKVGMDSTTVNYFALFEVISHSFVRKLAPNE 
FPHKLY I QNYTS AVPGTCLT IRKWLFTTEEE I LLNDNDLAVTY F 
FHQAVDDVKKGY I KAEEKS YQLQKL YEQRKMVMYLNMLRTCEGY 
NE I I F PHCACDS RRKGHV I TAI S I TH FKLHACTEEGQLENQV I A 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 

CFERVFCELKWRKEEY 


6880 


2110 


1437 


- RKDNCT AKEWTFPEAKWNTTARVFSHIRLGMGHVLIIVQCFISS 
MANI YNEKILKEGNQLTES I FIQNS KLYFFGILFNGLTLGLQRS 
NRDQIKNCGFFYGHRAFSVALIFVTAFQGLSVAFILKFLDNMFH 

VLMAQVTT VI I TTVS VLVFD FRP S LE FFLEAPS VLLS I F I YNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 

KTYTF 


6881 


2638 


2244 


- NDSKWEDIHVITGALKMFFRELPEPLFTFNHFNUKVNAI KQEPK 
QRVAAVKDL IRQLPKPNQDTMQ I L FRHLRRV I ENGE KNRMT YQ S 
IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QBGKMVTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLPI KTPT 
TNAVHKCRVHGLE I EGRDCGEATAQW I TS FLKSQP YRLVHFE PH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 








KVKATNFRPNIVISGCDVYAEDSWDELLIGDVBLICRVMACSRCI 
LTTVDPDTG VMS RKE P LE TLKS YRQCD P SE R KL YGKS P LFGQ YF 
VLENPGTI KVGDPVYLLGQ 


6883 


2794 


2256 


NS KLKliNQN LKLF I TLTYQ VLS LHG WG PG I HLQ KEGAFP VTQNR 
ALQLLYDLRYLNIVLTAKGDBVKSGRSKPDSRIEKVTDHLEALI 
DP FDLDVFTPHLNSNLHRLVQRTS VLFGLVTGTENQLAPRS STF 
NSQEPHNILPLASSQIRFGLLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


E FER VTAEAVKPRETS E PRAAAQR FCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYHrTDDHTKVCASSKGANASNPGPFGDV 
LCDSPYQLILSAFDFIKNSGQEASFMIWTGDSPPHVPVPELSTD 
TVINVITNMTTTIQSLFPNLQVFPAIiGNHDYWPQDQLSWTSKV 
YNAVANLW KPWLDEE A I S TLRKGG F YS QKVTTNPNLR IIS LNTN 
L YYG PNI MTLNKTDP ANQ FE WL E S TLNNSQQNKE KVY 1 1 AH V P V 
G YLPS SON I TAMREYYNEKLIDI FQKYSDVI AGQFYGHTHRDS I 
KVLSDKKGSPWSLFVAPAVTPVKSVLEKOTNNPGIRLFQYDPR 
DYKLLDMLQYYLNLTEANLKGESIWKLEYILTQTYDIEDLQPES 
LYGLAKQFTI LDSKQF I KYYNYFFVS YDSSVTCDKTCKAFQ ICA 
IMNLDNI SYAD CLKQLY I KHNY 


6886


2 


1341 


QCGG I PGREGGSSRPLEEGTGS SPACVRGAAPGSEDAFY PTRAK 
QARVSQELKKAAKRTVS ISEGPDTLGDGMRERRETLAIiAPEPEP 
LE KEACEKWKRP FRSAS ATSLTLSHCVDWKGLLDFKKRRGHS I 
GGAPEQRYCI I PVCVAARLPTRAQDVliDAHLSEVNAVRFGPKSS 
LLATGGADRLIHLWNWGSRLEANQTLEGAGGSITSVDFDPSGY 
QVLAATYNQAAQLWKVGEAQS KETLS GHKDKVTAAKFKLTRHQ A 
VTGSRDRTVKE WDLGRA YCS RTINVLS YCNDWCGDHI IIS GHN 
DQKIRFWDSRGPHCTQVIPVQGRVTSLSLSHDQLHLLSCSRDNT 
LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFSPDRSYALAGSCD 
GALY I WDVDTGKLESRLQGPHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRS Q S PRS PAG P FRGGTG WW PE P AVCLCVAVG PQRLS S PGLVY 
NASGSEHCYDIYRLYHSCADPTGCX3TGPDARAWDYQACTEINLT 
FASNNVTDMFPDL P FTDELRQR YCLDTWGVW PR P DWLLTS FWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAKHLD 
LRASKPEDPASWEARKLEATI IGEWVKAARREQQPALRGGPRL 
SL 


6888 


1 


992 


FVAYVKKE I PHTWTHCLLNPHALVI KTLPTKLRDALFTWRVI 
NFI KGRAPNHRLFQAFFEE IG I EYSVLLFHTEMRWLSRGQILTH 
I FEMYEE INQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
E E 1 1 VSDNEG I FI AAE I TLHLQQ LSNF FHG YFS I GDLNEAS KW I 
LDPFLFNIDFVDDSYLMKNDLAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE ILMPFATTYLCELGFS ITFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


a 


1534 


LTLENQ I KEEREQDNSES PNGRTSPLVSQNNEQGSTLRDLLTTT 
AGKLRVGS TDAG I AFAP VYSMGAPSS KSGRTMPN I LDDI I AS W 
ENKIPPSKTSKINVKPELKEEPEESIISAVDENNKLYSDIPHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAWSGVHKKMNISL 
WKAESISLDFGDHQADLLNCKDS I ISNANVKEFWDGFEEVSKRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLAS HLPGFFVRPDLGPRLCS AYGWAAKDHD I GTTNLH I 
EVS DWN I LVYVG IAKGNG I LS KAG I LKKFEEEDLDD I LRKRLK 
DSSEIPGALWHIYAGKDVDKIRBFLQKISKBQGLEVLPEHDPIR 
DQSWYVNKKLRQRLLEEYGVRTWTLIQFLGDAIVLPAGALHQVQ 








NFHSCI QVTEDFVS PEHLVES FHLTQELRLLKEE I NYDDKLQVK 
NILYHAVKEMVRALKIHEDEVDDMEEN 


6890 


3 


667 


THACGMW I PLYLHRALWHKTAETCNSPPCGAKDSLI FGAITCF 
TGFLGVDTGAGATRWCRLKTQRADPLVCAVGMLGSAIFI CLI FV 
AAKS S I VGA Y I C I FVGE T LLFSNW A I TAD I LM YW I PTRRAT AV 
ALQSFTSHLLGDAGSPYLIGFISDLIRQSTKDSPLWEFLSLGYA 
LMLCP FVWLGGMFFLATALF FVSDRARAEQQVNQ LAMP PAS VK 
V 


6891 


1980 


1262 


LRIHQELLSKELKLLRGITIESIIHIGLAAGKEQFMQDASNVMQ 
LLLKTQSHLYNMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 
AVPLL VKVI KRAHS KT KKNVI AT EN C I S AI G KI LKFKPNCVNVD 
EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPWIGPNNSN 
LPKI I SI IAEGKINETINYEDPCAKRLANVVRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892 


3 


876 


RS VAAASG PGAWGTDHYCIiELLRKRD YEGYLCSLLLP AESRS S V 
FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKELE 
NYAENTQSSLLYLTLEILGIKDLHADHAASHIGKACGIVTCLRA 
TPYHGSRRKVFLPMD1CMLHGVSQEDFLRRNQDKNVRDVIYDIA 
SQAHLHLKHARS FHKTVP VKAFPAFLQTVSLEDFLKK IQRVDFD 
I FH P SLQQKNT LL PLYL Y IQ S WRKT Y 


6893 


1 


842 


"DGERKSMS VERTFS EINKAE EQYS L CQELCS ELAQDLQ KERLKG 
RTVTIKLKNVNFEVKTRASTVSSVVSTAEEIFAIAKELLKTEID 
ADF P H P LRLRLMG VR I S S F PNEEDR KHQQRS 1 1 G F LCAGNQALS 
ATECTliEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVNKQS FQTSQP FQVLKKKMNENLE I SENS DDCQ I LTCPVCFR 
AQGCISLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKE 
NVPAS S LCEKQD YEAH 


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG 
DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2379 


478 


VTYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALLYALASHKACKLAILHLINGTI KGDERYAE I FQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
EQLSNSLPNKELMTS I CDCLLATLANSESS YNCLLTCVRTMMFL 
AEHD YGLFHLKS S LRKNSS ALHS LLKRWSTFS KDTGELAS S FL 
EFMRQILNSDTIGCCGDDNGLMEVEGAHTSRTMSINAAELKQLL 
Q S KEE S PENLFLE LE KL»VL EHS KDDDNLD S LL DS WG LKQMLE S 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSELERSFL 
SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEYIEPAKRAHW 
P P PRGRGRGG FGQG I R PHD I FRQRKQNTS R PPSMHVDDFVAAES 
KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
P LP P LR PLS S TGYRPS PRDRAS RGRGGLG P S WAS ANS GSGGS RG 
KFVS GGSGRGRHVRS FTR 


6896 


1 


555 


GN I VIQKKKYNKQHI 1 PLENVT I DS I KDEGDLRNGVf LI KTPTKS 
FAVYAATATEKSE WMNH INKCVTDLLS KS GKTPSNEHAAVWVPD 
SEATVCMRCQKAKFTPVNRRHHCRKCGFWCGPCSEKRFLLPSQ 
SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


" GDGLMHEWNGLMERPDWETAIQKPLCSLPAGSGNALAASLNHY 
AGYEQVTNEDLLTNCTLLLCRRLLS PMNLLS LHTASGLRLFSVL 
S LAWG FI ADVDLES E KYRRLGEMR FTLGT FLRLAALRT YRGR LA 
YLPVGRVGSKTPASPVVVQQGPVDAHLVPLEEPVPSHWTVVPDE 
DFVLVIjALLHSHLGSEMFAAPMGRCAAGVMHLFYVRAGVSRAML 








LRLFLAMEKGRHMEYECPYLVYVPWAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVT AVAS LLKGRQG I YT ENE RRMGAV IKIRFFKIM LVL 1 1 CW 
LSNIINESIiL F YLEMQTD I NGGS LKP VRT AAKTT WF I MG I LNP A 
QGFLLSLAFYGWTGCSLGFQSPRKEIQWESLTTSAAEGAHPSPL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASESC 
NKNEGDPALPTHGDL 


6899 


120 


827 


MKVRKNNDAYLLDKNKINMDCF I S CF FKKMLTTLMFSHSGI LSL 
LEHGEEYTFSLPCAYARSIIiTVPWVELGGKVSVNCAKTGYSASI 
TFHTKPFYGGKLHRVTAEVKHNITNTVVCRVQGEWNSVLEFTYS 
NGETKYVDLTKLAVTKKRVRPLEKQDPFESRRLWKNVTDSLRES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 
HKPLWKIIPTTQPAE 


6900 


3 


451 


TEVLGSKGIHELRSSTSALHHALEESASLLTMFWRAALPSTHIP 
VLPGKVGESTERELLEIiRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQF I VS QLTRTHD VLKKARTNLE VRKLLHQS E AP S L.S PTHHHP 
LADLVGDSWPALRFQEK 


6901 


1 


201 


1 DDNMVQRLETDFKMTLQQQSTLEQWAAWLDNVMMQALKPYEGRP 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNP S S ALEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTGIHILVIDQMVQNFQDESCFLFSTVKAESSDGI 
HULK 


6904 


464 


2092 


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKKSGNFDLLLC 
VGNF FGSTQDAEWEE YKTG I KKAP I QT YVLGANNQETVKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYbSGTESLNEPVPGYSF 
SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSGEVD 
T KKCG S AL VS S LATG LKPR YH F AALE KT YYERLP YRNHI I LQEN 
AQHAT R F I ALANVGN P EKKKYLYAFS I V P MKLMD AAE L VKQP P D 
VTENPYRKSGQEASIGKQILAPVEESACQFFFDLNEKQGRKRSS 
TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHLWNIGTH 
CYLALAKGGLSDDHVLILPIGHYQSVVELSAEVVEEVEKYKATIj 
RR FF KS RGKWCWF E RNYKSHHLQLQ V I P VP IS CSTTDD I KDAF 
ITQAQEQQ I ELLE I PEHSD I KQ IAQPGAAYFYVELDTGEKLFHR 

ikknfplqfgrevlaseailnvpdksdwrqcqiskedeetlarr 
frkd fe p yd ftldd 


6905 


1 


226 


vsktgeaetitshylfalgvyrtlylfnwiwryhfegffdliai 

VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


611 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFFCTKRIAGKIIPAI
ATTTATVSGLVALEMIKVTGGYPFE1AYKNWFLNLAIPIVVFTET 
TEVRKTKIRNGIS FTI WDRWTVHGKEDFTIjLDFINAVKEKYGI E 
P TMWQG VKMLYVP VM PGHAKR LKLTMHKL VKPTTEKKYVDLT V 
S FAPD I DGDEDLPGP PVRY Y FSHDTD 


6907 


2 


2228 


- T D r'\roxnj& rpz^ttp crFFqTQHTjIMSRRSQRLTRYSQGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQLGPSS 
DAHTS YYS E S L VHES W F P PRS S LE ELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
S SRLRS AVS RAGSLLWMVATS PGRLFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFSSLKTFLWFLLPLLLLTCLTYGAWYFYPYGLQ 
TFHPALVSWWAAKDSRRADEGWEARDSSPHFQAEQRVMSRVHSL 
ERRLEALAAEFSSNWQKEAMRLERLELROGAPGQGGGGGLSHED 
TLALLEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 
DLFKKI VRASQE SE AR I QQLKS EWQSMTQES FQESS VKELRRLE 
DQIAGl^QEIAALALKQSSVABEVGLLPQQIQAVRDDVESQFPA 
WISQFLARGGGGRVGLLQREEMQAQLRELESKILTKVAEMQGKS 








AREAAASLSLTLQKBGVIGVTEEQVHHIVKQALQRYSEDRIGLA 
DYALESGGASVISTRCSETYETKTALLSLFGlPLVfYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 
NSTISSAPKDFAIFGFDEDLQQEGTLLGKPTYDQDGEPIQTFHF 
QAPTMATYQWELRILTNWGHPEYTCIYRFRVHGEPAH 


6908 


3 


760 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTFAEELQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPILELEQWIDKYTSQLPPLT 
AF I LPSGGKI S SALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPY LRGTI KMMQ AVRQAFQDQDDRRTWDGRP LTMAATFDDC LYA 
LCWDTIKRSSQTGEWQNIAIMTEEPELSPAYLISEAMRRSRMS 

LYC 


6910 


1 


1068 


LVPWVIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
L INGFLNFNVAFALALLVLPLTSLME YLLQRFHVQNLGHPYWLT 
LAPMY I WF 1 1 FF I QPHKEERFLF P VY P LI CLCGAVALS ALQHS F 
LYFQKCYHFVFQRYRLEHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFLLPDNWQLQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
ktt T7TrpQRYTniqKCHYLVDLDTMRETPREPKYSSNKEEWISLAY 
R P FLDAS RS S KLLRAF YVP FLS DQ YTVYVNYT I LKPRKAKQ I RK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLIS I FGSSFSGLLRKSPGGGREEEEGEESG 
PEAAE PGQ I CCDKPVLRDMNPWSTAI VAF 


6912 


1 


844 


AMKP VETHSFQMLFTI LSTGSALKAQS YEDAYRCI KSS ILLGS I 
SGGTDI ISCFMGHNFSLPVYKGE IQARNLGMAVEAWNEEGKAVW 
GESGELVCTKPIPCQPTHFWNDENGNKYRKAYFSKFPGIWAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVESFEEVE 
DSLCVPQYMCYREERVI LFLKMASGHAFQPDLVKR IRDAI RMGL 
SARHVP SL I LETKGI P YTLNGKKVEVAVKQ 1 1 AGKAVEQGGAFS 
NPBTLDLYRDIPELQGF 
KKSHEESHKEELSYGAQASLPLPCSDFR 


6913 
6914 


1643 
1251 


1558
615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
CINNAGLARPDTLLSGSTSGWKDMFNVNVLALSICTREAYQSMK 
ERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVIYVLSTPAHIQIGDIQMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTEbLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAF I TTVT FLW KVS AITWS CLPL YVLKYLRRKLS P PS YCKLAS 


6916 


254 


652 


GRSLSFKlfFLlWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLWALTVRTKHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYIiRRKLSPPSYCKLAS 


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFTESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAF I TTVTFLW KVS AI T WS CLPLYVLK YLRRKLS P P S YC KLAS 


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSFPQWLPGGGGGQVSSCS
DTDVP YLLLAVKSE PGRFAERQAVRETWGS PAPGI RLLFLLGS P 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRH CPT VS F VLRAQDDAFVHT P ALLAHLRAL P PAS ARS L YLGE 
VFTQAMPLRKPGGPFYVPES FFEGGYPAYASGGGYVIAGRIAPW 
LLRAAARVAPFPFEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPOAS IRLMKQLQDPRLQC 


6935 1 


886 


543 


NSALYVAGGNDGTSCLNSVERYSPKAGAWESVAPMN I KKSTHUU 
VAMDGWLYAVGGNDGSSSLNS IEKYNPRTNXWVAASCMFTRRSS 

VG VAVLilSijLiWr rr roof ijjo vooxoaj 


6936 


1347 


567 


RSHRRQFLSRALLEFFGkoHPPPHRLFRkoLNVGLHYSHIFFbT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRALLQLRGLDPS 
LPS P LPNLG PQG PALT PE QEN I LHTTQTDCYNNLAACLLQME P V 
NYERVREYSQKVLERQPDNAKALYRAGVAFFHLQDYDQARHYLL 
AAVNRQPKDANVRRYLQuTQoiiJjooxnKiULi^wiJXi^J* 117 ^ 


6937 


1 


727 


AVEFRCCPGRDPACPARGWRLDRVYGTCFCDQACRFTGDCLWX 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


N S RKLE IjAER VDTDFMU LiltAKKQS 5 br\.tsrl 1J o la i LjU i v v v v uxi 
EGNVAAAVSSGGLALKHPGRVGQAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTILARECSHALQAEDAHQALLETMQNKFISS 
PFLASEDGVLGGVIVLRSCRCSAEPDSSQNKQTLLVEFLWSHTT 
ES M CVG YMS AQDG KAKTHI S RL P PG AVAGQS VAI EGGVCRIX3E P 
S ELTLQAECEASQRHFRT 


6939


3 


810 


KVTAPRRPQRYSSGHGSDNoo V jjovjEiJUr r/wivjrK iftur ruiowi^u 
G YES LRRDS E ATGS AS S APDS MS E SG AAS PG ARTRS LKS P KKRA 
TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
E I KVYE I DD VERLQR P RPTPREAP TQGLACVSTRLRLAERRQQR 
LRE VQAKHKHLCEE LAETQGR LMLE PGR WLEQFE VDPE LE PES A 
E YLAALERATAALEQCVNLCKAHVMMVTCFD I SVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 

IJUOSVHLAWDLSRSU3AWFSRVTNNVVLEAPFLVGIEGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELKEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


" ""SLSRADSDPHGPHTCGHVLNVI IGSNVIALAEAQRQAEAIiGYQA 
tnrr c t\ a Mnr nwc m& ott VOT ,T AHVARTRLT P SMAGAS VEEDAQL 
HELAAELQ I PDLQLEE ALETMAWGRGPVCLLAGGEPTVQLQGSG 
RGGRNQELALRVGAELRRWPLGP IDVLFLSGGTDGQDGPTEAAG 
AW VT PELASQAAAEGLD I AT FLAHNDSHT F FCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRLYADGGYDG 
OTYLNTMESYBPQTNEWTQMASLNIGRAGACVVVIKQP 


6943 


-L 


739 


" PMATGDGAKTLAIHVKALTADSIRITWKATLPASSFRLSWl.RUi 
HS PAGGS ITETLVQGBKTEYLLTALE? KPTY 1 1 CMVTMETTNAY 
VADETPVCAKAETADSYGPTTTLNQEQNAGPMASLPLAGIIGGA 
VALVFLFLVLGA I CW YVHQ AGELLTRE RAYNRG S RKKDD YMESG 
TKKDNS ILE I RGPGLQMLP INPYRAKEE YWHTI FPSNGS SLCK 
ATHT I G YGTTRG YRDGG I PDI D YS YT 


6944 


960


156 


" VANILLNGVKYESELTGSSERAEQPIiSVGRLCSTICNMPKALRT 
LCVNHFLGWLS FEGMLLFYTDFMGE WFQGDPKAPHTS EA YQKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVDISLLSCQYFLAQILVSLVLGPLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 

LNV 
6945 2067 


— ™ HS 

II 

R\ 
HI 
V 
A' 
C 
G 
L 
C 
I 

1 




6946


25bl 

1 


GISPVSQHMATWALYNLVSVYPDKYCPULIKEGGMPLLRDIIKM 


6947 2 

i 


4* "1682 ' 


S QSFPAPRSMRVASGGP^^^^^^^ spyMEyHpGGE 
^ lW SSSw LKSCLVGRMMKPAVUC 
««M*M«?225SKroSS SYPSTOWFQTDSLVTI 
DVREEE^GMl^KSQVTDT^ GLLISYTYW/R , A 
/EHIV*TEGYQFRLKNS SfflTL F ^ HpucNHNSL 
MRFRKIFL^/CESVGWBIV^ psTHLQWIGQ 

IPRroTG ™X^PvlGsSsEFKKPVLPNNKyiYFLIK 
^^^SSmVSSPEGHFKISKFQELEDLFLLA 

vSSeSisewngkoghispallseflkrnldk 




58 

4656 




6949 152







574 



WO 01/53312 



PCT/US00/34263



SEQ ID NO:


Predicted beginning nucleotide location corresponding
to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding
to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
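The symbol legend in the column heading can be applied mechanically when checking an amino acid segment. The following is a minimal sketch, assuming the segments are handled as plain strings; the function name `summarize_segment` and the returned dictionary layout are hypothetical and not part of this listing. It counts residues, tallies the three special markers (`*`, `/`, `\`), and reports any character that the legend does not define, which is usually a sign of a transcription error.

```python
# One-letter codes copied from the legend of this table.
RESIDUE_NAMES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Special markers defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Count residues and markers in one segment (hypothetical helper).

    Whitespace is ignored. Any other character that the legend does not
    define (digits, lowercase letters, punctuation) is collected in
    'unrecognized' in order of appearance.
    """
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUE_NAMES:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {
        "residues": residues,
        "markers": markers,
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # A raw string keeps the backslash marker literal.
    print(summarize_segment(r"AC*D/E\FX"))
```

A non-empty `unrecognized` list does not say what the intended residue was; it only flags positions that need to be checked against the original filing.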








SCSTDTSEVPRWPENKEDHLVYADEESSNITDGRITPEPAVSNT 

EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGFSSRASDKDIT 

VS KNTSLPPLWS PEAERSHSLSQHTATSSKKPAFNLS AFGTLS P 

SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 

P VRRQMKAKQLSAQS YG VTS STARRI bQSLEKMSS PLADAKR IP 

SIVSS PLNS PIiDRSG IDITD FQAKREKVDSQYP P VQRLMTPKPV 

SIATNRSVYFKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 

QALTNKVQMTS PSSTGS PMFKFSSP I VKSTEANVLPPS S IGFTF 

SVPVAKTAELSGSSSTLEP I ISSSAHHVTTVNSTNCKKTPPEDC 

EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 

TR PA I S SFSSSG IGFGESLKAGSS WQGDTCLLQNKVTDNKC I AC 

QAAKLS PRDTAKQTG I ET PNKSG KTTLS ASGTGFGDKF KP V I GT 

WDCDTCLVQNKPEAIKCVACETPKPGTCVTGlALTLTvVSESAET 

MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 

SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 

MSEGF*FSKHIVGFKFGVSSESKPEEVKKDSKNDNFKFGLSFGL 

SNPVFLT P FQFGVSNLGQEE KKEELL KS S CAG FRFGTGVINS TR 

VPANTIVTSENKSSFNLGTIETKSVSVAPLKCQTSEAKKEEMPA 

TKGGFSFGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 

KLTMKE P KC \Q P V FS FGE FQRQTKDENS SKSTFSFSMTKPSEKE 

SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 

AGGG\IFGSSTSSSNPPVATFVFGQSSNPGSSS\AFGNTAESST 

S OS LL FS QDS KLATTS STGTAVTPFVFG PGAS S NNTTTSG FG FG 

ATTTSSSAGSSFVFGTGPSAPSASPAFGANQTPTFGQSQGASQP 

NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 

SAFGSGTTPNSSSAFQFGSSTTNFNFTNNSPSGVFTFGANSSTP 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 

TAVRRRK 


6950 


2585 


411 


PRPGSRSGLCRRAGERGAVRAGGtSRRTRAE * IMDEL.HYQDTDS 
D VP EQRDS KCKVKWTHE EDEQLRALVRQFGQQDW KFLASH FPNR 
TDQQCQYRWLRVLNPDLVKGPWTKEEDQKVIELVKKYGTKQWTL 
I AKHLKGRLGKQCRERWHNHLNPE VKKS CWTEEEDR 1 1 CEAHKV 
LGNRWAEIAKMLPGRTDNAVKNHWNSTI KRKVDTGGFLSESKDC 
KPPVYLLLEUEDKDGLQSAQPTEGQGSLLTNWPSVPPTIKEEEN 
SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 
ETSLPYKWVVEAANLLIPAVGSSLSEALDLIESDPDAMCDLSKF 
DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTEY 
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 
RRVALS PVTENSTSLS FLDSCNS LTPKST PVKTLPFS PSQFLNF 

T.TVTvmiTMTiT wt onnoi TOTDWrcn W\A7T*TPT.WRr)KTPI_iHOKIiAAF 
WNKQD TLib 1-iCibr ai, lb 1 rVLoy^V vv l i jr.univuj\.J.r'i-ins^.i\x*™^* 

VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDLKE 
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSS SLTLSGI KEDNSLLNQGF 
LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE 
KARQLLGRLKPSHTSRTL I LS 


6951 


1940 


239 


AGPDDTMKRSLQALYCQLLSFLLILALTEALAFAIQEPSPRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 
TSSKPEGRPRGQAAPTILLTKPPGATSRPTTAPPRTTTRRPPRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKIFQIYKGNFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 








TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LIAYCYP\CT 
SRPLSTSSGVFTAATGPTPAAFDTSVSAPSQGIPQGASTTPQAP 
THPSRVSESTISGAKEETVA\PSP * PTGCPVLSPQWYPQPQAI S 
STAWS P PG PGSLGQQGTS PMWPRGTNRSTEP P SA* ARW I S PG * S 
WPSACPSPP\LCPADGVLHEBEEEDRQPGEQPEAYGNNTHHPGT 
T FQQAC \ RGAAPG E I P VPLKPLRTQLSE PRS P ANG D YRDTGMVP 
C 


6952 


658 


304 


PESEGE SGEMTDRYT I HS QLEHLQS KY IGT \ AT PT P PSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE *EQS AS PLQLDGKDAS ALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCQSS 


6953 


1512 


349 


NWG KTRALASGKHVP FGKQTNPNKS / VHCDS *G* * RRETTQDES 
FSPHFRGKMGGW\KLEKELENTEQPVGGNEG*EHEVTGKLNSD 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 
GRALS S PGS LGRHLL I HS EDQRSNCAVCG AR FT SHATFNS EKLP 
EVliNMESLPTVHNEGPSSAEGKDiAFSPPVYPAGILLVCNNCAA 
YRKLL EAQTP S VRKWALRRQNE P LE VRLQRLERERTAKKS RRDN 
ETPEEREVRRMRDREAKRLQRMQETDEQRARRLQRDREAMRLKR 
A I E T P E KRQARL I REREAKRLKRRLEKMDMMIjRAQFGQD P S AMA 
ALAAEMNFFQLP VSGVELD SQLLGKMAFEEQNS SSLH 


6954 


819 


1 


PPP PF 1 1 PSHPREAGT * AG * KRSGDSE CS PPVEQ * A* TRAAAQN 
* PQR*RWTEGNSPQASAVATPGQGASPAAPRCTP * PSRRHRRLP 
PGARPPAG * AAPAPTKPWLAGPASAPQPGAAPLS PPAPPLIRTR 
*CAGAAARGRPRRDRSPRPRTPGGCSWSEPRTPPAVSASAQTPS 
DAG*AGGR*GQRQRPSTGR*PPGVGGAGRSHRREGTIPGNPHPR 
AS * RAGWQR* PGP / REWGL*EPQGEEMSGPGGPGGAPPNQVGSS 
VMQAMSTGI 


6955 


1968 


782 


PPGRRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRrMPFLGQD 
WRSPGWSWIKTEDGWKRCESCSQKLERENNHCNISHSIILNSED 
GEI FNNEEHE YAS KKRKKDHFRNDTNTQS FYREKWI YVHKE STK 
ERHGYCTLGEAFNRtiDFSSAIQDIRRFNYWKLLQLIAKSQLTS 
LSGVAQKNYFNILDKIVQKVLDDHHNPRLIKDLLQDLSSTJjCIL 
/N*RSREVCISGKHQYLDLPIRNYSRLATTATGSSDD*ASE\NG 
LTLSDLPLHMLNNILYRFSDGWDIITLGQVTPTLYMLSEDRQLW 
KKLC Q YH FAEKQFCRHL I LS E KGHIEWKLMYFALQ KH YPAKE Q Y 
GDTLHFCRHCSILFWKDSGHPCTAADPDSCFTPVSPQHFIDLFK 

F 


6956 


8605 


3839 


QTSTS I FASPTS P PVLGESVLQDNSFDLNNGSDAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTrSPAVSLWSPAAS 
PE I SPEVCPAASTWSPAVFS WS PAS S AVLPAVSLE VPLTAS V 
T S P KAS P VT SPAAAFP TAS P ANKD VS S FLETTADVEE I TGEGLT 
ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 
YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDFF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
PK^^CRGRGRPPKVKITELLNKTDNRPLKKLEAQETIlNEEDKAKI 
AKS KKKMRQ KVQRGECQTT I QGQARNKRKQETKSLKQ KEAKKKS 
KAEKEKGKTKQEIOiKEKVKREKKEKVKMKEKEEVTKAJCPACKAD 
KTLATQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 
GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAXDVPSLGVLQEGL 
LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLKILGEKVSEI 
PLTRDNVS E I LRCFLMA YGVE PALCDRLRTQ P FQAQP PQQKAA V 
LAFLVHELNGSTLIINEIDKTLESMSSYRKNKWIVEGRLRRLKT 
VLAKRTGRSEVEMEGPEECLGRRRSSRIMEVTSGMEEEEEEESI 
AAVPGRRGRRDGEVDATAS SIPELERQIEKLS KRQL FFRKKLLH 
SSQMLRAVSLGQDRYRRRYWVLPYLAGIFVEGTEGNLVPEEVIK 
KETDSLKVAAHASLNPALFSMKMELAGSNTTAS S PARARGRPRK 








T KPG SMQPRH L KS P VRGQDS EQ P Q AQLQP EAQLHAP AQ P QPQLQ 
LQLQ SH KG FLEQEG S PLS LGQS QHDLS QS AFLSWLS QTQ SHS S L 
LSS S VLTPDSS PGKLDPAPSQP PEE PE PDEAESS PDPQALWFN I 
S AQMPCNAAPTPPPAVSEDQ PTPS PQQLASS KPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 
FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 
GIREKALHKHLNKHRDFLQEVCLRPSADPIFEPRQLPAFQEGIM 
SWSPKEKTYETDLAVLQWVEELEQRVIMSDLQIRGWTCPSPDST 
REDIAYCBHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 
AALEQNVERRYLRE PLWPTHEWLEKALLSTPNGAPEGTTTE I S 
YEI TPRIRVWRQTLERCRSAAOVCLCLGQLERS IAWEKSVNKVT 
CLVCRKGDNDEFLLLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 
GRES PAAGPRYS EEGLS PS KRRRLSMRNHHSDLTFCE 1 1 LMEME 
SHDAAWPFLEPVNPRLVSGYRRI I KNPMDFSTMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YQGKQGQ S VRQGRWG VTLWHL P PT FQTKTCHFHLLML PW VQTQ V 
RYNPDF 


6957 


82 


3514


hLIVAMPEMKKEENBVPAPAPPPEBPSKEKEAGTTPAKDWTLV 
ETPPGEEQAKQNANSQLSILFIEKPQGGTVKVGEDITFIAKVKA 
EDLS EKPTING S R KWMDLASKAGKHLQLKBTF E RHS RVYTFEMQ 
IIKAKDNFAGNYRCEVTYKDKFDSCSFDLEVHESTGTTPNIDIR 
SAFKRSGEGQEDAGELDFSGLLKRREVKQQEEEPQVDVWELbKN 
TKPSEYEKIAFQYESPTCSGMLKRLKRSIREEKKSAAFAKILDP 
VYQVDKGGRVRFWELADPKLEVKWNKNGQELRPSTKYIFEDTR 
CQS I LNIDNCQMTDDSE YYVTAGDEKCSTELLVREPP IMVTKQL 
EDTTDYCGERVELECE VS EDDAQVKWFKNGEE I ILVQTRYRIRV 
EGKKH I LI I EGATKADAADYS VMTTGGQSS AKLS VDLKPLKI LT 
PLTDQTVNLGKE I CLKCE I SEN I PGKWTKNGLPVQESDRLKWH 
KGR IHKLVIDHALTEDBGDYVFAPDAYNVTLPAKVHVI DPPKII 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWD F PDP P VAPTVT E VGDDWC I MNWEPPAYDGGS PI LG Y FIE 
RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 
G I S KP SMPSRP FVPLAVTS P PTLLT VD S VTDTTVTMRWRP PDH I 
GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDLPAEDWIVANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 
IIEPPKIHSPKHLKQTYIRRVGDRVILVIPFQGKPRPELTWKKD 
GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TASID IRI IDRPGP PQIVKI ED VWGRNVALTWTPPKDDGNAAI T 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLE I RKPS P YDGGT YCCKAVNDLGTVE I ECKLEVKVI AQ 


6958 


274 


1663 


o d t q x> \r ittf fi csnrc 3 &Mn f VKVD I E KEVTCP I CLE LLTE PLS L 
DCGHS FCQACITAKI KESVI ISRGESSCPVCQTRFQPGNLRPNR 
HLANI VERVKEVKMS PQEGQKRDVCEHHGKKLQ I FCKEDGKVI C 
WVCELSQEHQGHQTFRINEVVKECQEKLQVALQRLIKENQEAEK 
LEDDIRQERTAWKNYIQIBRQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVQQRQDASTLISDLQRRLRGSSVEM 
LQDV I DVMKRSESWTLKKP KS VS KKLKS VFRVPDLSGMLQ VLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLNK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 


6959 


1 


1469 


' S L VHWE FGRG I BD FP YLFFQLTHCQQR I CS VTQAG VQWCDHSS 








LQPQTPGLNQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPPNVT 
WTELEDRIXSRVYPHPQDLLAALPLALVIjIiAMRLAFERFIGLPLS 
RWLGVRDQTRRQVKPNATLE KHFLTEGHRP KEPQLS LliAAQCGL 
TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGLSV 
LYHESWLWAP VMCWDRYPNQLTLS CPAADS EA\SLYWWYLLELG 
FYIiSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFSYSANIiLRIGSLVLLLHDSSDYLLEACKMVNYMQ 
YQOVCDALFLI FSFVFFYTRLVLFPTQI LYTTYYES ISNRGPFF 
GYYFFNGLLMLLQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 

RHTTAT 


6960


387 


2068 


AKWAREltBMQEF \TRSFF VRGRPDLSTLTHS IVRRRYLAHSGRS 

HLEPEEKQALKRLVEEEPLKMQVDEAASREDKLDLTKKGKRPPT 

PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTHP 

KEENPRRA\SKAVEESSDEERQRDIiPAQRGEESSEEEEKGYKGK 

TRKKPWKKQAPGKAS VSRKQAREESEES EAE PVQRTAKKVEGN . 

KGTKSLKESEQESEEEIIiAQKKEQREEEVEEEEKEEDEEKGDWK 

PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 

DDSGRDREPPVQRKSEDRTQLKGGKRXSGSSEDEEDSGKGEPTA 

KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 

SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 

ACGAHRNYKKLLGSCCSHKERLS ILRAELEALiGMKGTPSLGKCR 

ALKEQREEAAEVASLDVANIISGSGRPRRRTAWNPLGEAAPPGE 

LYRRTLDSDEERPRPAPPDWSHMRG1ISSDGESN 


6961


340 


1646 


"RPWSSPTMKPNFSLRLRIFNLNCWGIPYLSKHRADRMRRLGDFJb 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSG1IGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLIjVLHL 
SGMVLNAYVTHLHAE YNRQ KDI YLAHRVAQAWELAQF IHHTSKK 
ADWLLCGDLNMHPEDLGCCLLKEVTTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFD PHRGTP LS DHEALMATLFVRHS PP QQNP S STHGP \ AE RS 

pl/mcvclkealdgsixslgmaNqarwwaNtfaVsyviglglNll 
iju^lcvlaagggageaaillwtpsvglvlwagafylfhvqevng 

LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


R P W S S P TMKPN FS LRLR I FNLNCWG I P YLS KHRAD RMRRLGD FL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGIIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAE YNRQKD I YLAHRVAQAWELAQF I HHTS KK 
ADWLLCX3DLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGS E EG 
NTM VP KNCY VS QQELKP FP FGVR ID YVLYKAVSGFYI SCKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPS STHGP \AERS 
PL/MC^CLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
LALLCVLAAGGGAGEAAI LLWTPS VGLVLWAGAFYLFHVQE VNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\IjGQQEGDRTKEQ 


6963 


374 


2618 


" d^ttdt tt K7 t YY pktaFNOKASEENE ITQPGGSSAKPGLPCLNF 
EAVLSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTERIHSIN 

i^nfsnsvi^tlneqr^ghfcdvtvrihgsmlraqrcvlaags 
pffqdklllgysdieipswsvqsvqklidfmysgvlrvsqsea 
lqiltaasilqiktvidectrivsqnvgdvfpgiqdsgqdtprg 
tpesgtsgqssdtesgylqshpqhsvdriysalyacsmqngsge 
rsfysgawshhetalglprdhhmedpswitrihersqqmeryl 
sttpetthcrkqprpvriqtlvgnihikqemeddydyygqqrvq 
ilerneseectedtdqaegtesepkgesfdsgvsssigtepdsv 
eqqfgpgaardsqaeptqpeqaaeapaeggpqtnqletgasspe 
rsnevemdstvitvsnssdksvlqqpsvntsigqplpstqlylr 
qtetltsnlrmpltltsntqvigtagntylpalfttqpagsgpk 

_ n ii l ,~ltij~t c T'TTT'^nT >p A POP! iAS S AGH 








PFLFSLPQPLAGQQTQFVIVbijPGLjS 1 r iAUijrAfyrwwfrtwn 
STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
CWRSFSLKDYLI K\HMVTHTGVRAYQCS I CNKRFTQKS S LNVHM 
RLHRGEKSYECY I CKKKFSHKTLLERHVALHSASNGTPPAGTPP 
r: a c ar; P pnWACTEGTTYVCS VC P AKFDQI EQ FNDHMRMHVSDG 


6964


1 


178


SGRP FFFFFSNTDVYF I KKVTNRWTAGSSYKMTRMKS IGKI LLL 


6965 


757 


208 


-NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPiibAALEVCSLbS 
SGSLGYNLPQNH\GLLGRNTLVLLGQMRRISPFLCLKUKi>L»^Kf 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSMGQWVIEGSTLALRRY 

POESISTLE 


6966 


820 


1867 


IITALGVRGMPGCPCPGCXSMAGPRJjlatij'i'ALALELLGRACjObU^ 
ALRSRGTATACRLDNKESESWGALLSGERLDTWICSLLGSLMVG 
LSGWPLLVIPLEMGTMLRSEAGAWRLKQLLSFALGGLLGNVFL 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEK 
/ HVPGQQGGGDQ PG PQQR PHCCCRRAQWRP LSGP AGCRARPRCR 
GP\DIKVSGYLNLLANTIDNFTHGLAVAASFLVSKKIGLLTTMA 
ILLHE I PHE VGDFAI LLRAGFDRWSAAKLQLSTALrGGLLiGAGFA 
TrTOSPKGVEETAAWVLPFTSGGFLYIALVNVLPDLLEEEDPW 


6967 


162 


633 


~GFXjPFKYWILDLSASSRMETDCNPMEL>SSMSGFEEGSELNGFEG 

TDM KDMRLEAE AWNDVLFAVNNM F VS KSLRCADD VAY INVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTLN 

SPAYREAFGKR\LLQRLEALKRDGQS 


6968


1 


2265 


rgggggrggpgarererpgepektmeaaagurgcfqphpglqkt 

LEGFHLSSMSSIiGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 

paatepppvlhlpaiqppppvlpgpffmpsdrstercetvlege 
tiscfwggekrlclpqilnsvlrdfslqqinavcdelhiycsr 

CTADQLE ILKVMG I LPFS AP SCGL 1 TKTDAERLCNAliLYGGAYP 

ppckkelaaslalglelsersvrvyhe\cfgkckgl\lvpelys 

S P S AACIQCLD\ CRLMY P PHKFWHSHKALENRT CHW<j * \vz>h\ 

nwrayillsqdytgkeeqarlgr\clddvkekfdygnkykrrvp 

R VS S E P PAS I RP KTDDTS S QS PAPS E KDKPS S WLRTLAG S S NKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS 

FE TAVAPNVALAP P AQQ KWS S P P CAAAVSRAP E PLATCTQPRK 
RKLTVDTPGAPETLAPVAAPEEDKDSEAEVEVESREEFTSSLSS 
LSS PSFTSSSSAKDLGS PGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKEKFLHE WKMRVKQEE KLSAALQAKRS 
LHQELEFLRVAKKEKLREATF^KRNLRKEIERliRAENEKKMKEA 

NE S RLRLKRE LEQ ARQAR VCDKG CEAGRLRAKY S AQ I ED LQ VKL 
QHAEADREQLRADLLREREAREHIiEK\WK\ELQEQLWPRARPE 

*Ar,SEG\AAELEP 


6969 


1855 


118 


- AGTMH GRLKVKTSEEQAEAKRbEREQKLKLYOSATUAV^kKQA 
GELDESVLELTSQILGANPDFATLWNCRREVLQQLETQKSPEEL 
AALVKAELG FLE SCLR VNP KS YGTWHHRCWLLGRLP EPNWTREL 
ELCARFIJEVDERNFHCWDYRRFVATQAAVPPAEEIiAFTDSIilTR 
NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 

S R P LL VGS RME I LLLMVDDS P L I VE W RTPDGRNR P SHVWLCDL P 
AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 
OLFRCELSVEKSTVLQSELESCKELQELEPENKWCb\LTIILLM 
RALDPLLYEKETLQYFQTLK\AWDPKRATY\LDDLRSKFLLENS 
VLKMEYAEWVUiLAHKDLTVbCHLEQLLLVTHLDLSHNRLRTL 
PPAIAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRLOELLL 
CNNRLQQPAVLQPIiASCPRLVLiLNLQGNPLCQAVGILEQLAELL 

PSVSSVLT


6970 


3 


1528 


SFPPLLSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQIjBPLNE 
GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEDEVEILGPFPA 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKKEPIQPSV 
LRR T YNPDD Y FRKFE PHL YS LDSNS DDVDSL TDE EI LS K YQI/3M 
LHF S TQYDLLHNHLT VRV I EARDLP P P I S HDGS RQDMAHSNPYV 
KICLLiPDQKNS KQTGVKRKTQKPVFEERYTFE I PFLEAQRRTLfc 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKAL1PSSQNE 
VELGELLLSLNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHKMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVE QWHS LRS RAECD R VS P ASLE VT 


6971 


37 


3702 


ACF YVPGS RS FKLI PRHGLVNMGRSG KLP SG VS AKLKRWKKGHS 

SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 

RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLSDCTNVTFSKVQ 

RFWESNSAAHKEICAVLAAVTEVIRSQGGKETETEYFAALIRKA 

AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 

EATTTLHMLTLLKDLLPCFPEGLVKSCSETLLRVMTLSHVLVTA 

CAMQAFHSLFHARPGLSTLSAELNAQIITALYDYVPSENDLQPL 

LAWLKVMEKAH I NLVRLQWDLGLGHLP RFFGTAVT CLLS PHS QV 

LTAATQSLKEILKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 

EEGLTYKFHAAWS SVLQLLCVFFEACGRQAHP VMRKCI»QS LCDL 

RLS PHFPHTAALDQAVGAAVTSMGPE WLQAVPLE I DGSEETIoD 

FPRSWXiLPVIRDHVQETRLGFFTTYFLPIiANTLKSKAMDLAQAG 

STVESKI YDTLQWQMWTLLPGFCTRPTDVAI S FKGLARTLGMAI 

SERPDLRVTVCQALRTL I T KGCQ AEADRAEVS RFAKNFLP I LFN 

LYGQPVAAGDTPAPRRAVLETIRTYLTITDTQLVNSLLEKASEK 

VLDPASSDFTRLSVLDLWALAPCADEAAISKLYSTIRPYLESK 

AHGVQKKAYRVLEEVCASPQGPGALFVQSHLEDLKKTUjDSLRS 

TSSPAKRPRLKCLLHIVRKLSAEHKEFITALIPEVILCTKEVSV 

GARKNAFALLVEMGHAFLR FGSNQEE ALQCYIiVL I Y PGL VGAVT 

MVS CS I LALTHLLFEFKGLMGTSTVEQLLENVCLLIiASRTRDVV 

KSALGFI KVAVTVMDVAHLAKHVQLVMEAIGKLS DDMRRHFRMK 

LRNLFT\KFIPK\FGILTWGKKAVGPKEyHRVLVNIRKAEARAK 

RHRALSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 

EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 

TQPGPGRGRKKDHS FKVSADGRLI IREEADGNKMBEEEGAKGED 

SEMAD PMEDVI I RNKKHQKLKHQKE AEEEELE I PPQYQAGGS G I 

HRPVAKKAMPGAEYKAKKAKGDVKKKGRPDPYAYIPLNRSKLNR 

RKKMKLQGQFKGLVKAAQRGS QVGH KNRRKDRR P 


6972 


2179 


973 


" PGGAI LL PLWRRTR PREATVP RGAAQRGRARSAEGK 1 PSSQS PS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAAT ERARRGATMGAQLS TLGHMVLF P VWFLYS LLMKLFQRS TP 
A I TLES PD I KYPLRLI DRE 1 1 SHDTRRFRFALPS PQH I LGLP VG 
nUTVT ciutty-nt WRPYTPTSSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIIRTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF 
ANQTEKDILLRPELEELRNKHSARFKLWYTLDRAPEAWDYGQG\ 
FVNEEMIRDHLPPPE\EEPLVLMCGPPPMIQYACLPNL\DHVGH 

PTRRCFVF 


6973 


1 


1964 


" LQpRCAHRGLRAQKCGRPAPGVDAMVLCPVIGKLLHKRVVUAiiA 
S PRRQEILSNAGLRFEWPS KFKEKLDKASFATPYG YAMETAKQ 
KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE / SGREHSVFTGVAI VHCSSKDHQLDTRVS E FYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 








SDVEGGGSEPTQRDAGSRDEKAEAGEAGQATAEABCHRTRETLP 
P F PTRLLEL I EGFMLS KGLLTACKLKVFDLLKDEAPQKAAD I AS 
KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVY LASDGE 
YSLHGF I MHNNDLTWNLFTYLE FAIREGTNQHHRALGKKAEDLF 
QDAYYQS PETRLR FMRAMHGMTKLTACQVATAFNLSRFS SACD V 
GGCTGALARELARE YPRMQVTVFDLPDI I ELAAHFQPPGPQAVQ 
I H FAAGDFFRDPLPS AELYVLCR I LHDWPDDKVHKLLS RVAES C 
KPGAGLLLVETLLDEEKRVAQRALMQSLNMLVQTEGKERSLGEY 
QCLLELHGFHQVQWHLGGVLDAIL\PPKWPPEAQAACSL 


6974


3082 


2172 


RS CAAFAS FASRP PLELFAPPGSHRS P PGRG VATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTNI VPPTTIWTSS PQNTDADTAS PSNGTHNNSVLPVTASAP 
TSLLPKNISIESREEEITSPOSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPQAPASSPSSL 
STS P PEVFSAS VTTNHS STVTS TQ PTGAPTAPES PTEES SSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


RPRPTVHCCKWALKLETAMETLINVFHAHSGKEGDKYKLSFCKEL 
KELLQTELSGFLDVKELML*ATEALKTFEEA* KSPI IQCSSSRS 
SLPPAPQPPPYL*LSAVPFPIHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQBYWLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL* VAYGTTENSPVTFAHFPEDTVEQKAESVGRIMPHTEARI 
MNMEAGTLAKLNTPGELCIRGYCVMLGYWGEPQKTEEAVDQDKW 
YWTGDVATMNEQGFCKIVGRSKDMIIRGGENIYPAELEDFFHTH 
PKVQEVQWGVKDDRMGEEICACIRLKDGBETTVEEIKAFCKGK 
ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL*IKQQ 
ACPGRLA 


6977 


1298 


588 


SLFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R 
ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYQPEHMSFE 
EliLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLS EHG FGP I TTD I REGQT F YYAED YHQQ YLS KNPNGY 
CGLGGTGVSCPVGIKK 


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAKQRLQQLFKGSQ 
FAI RWG F I P L V I YLG F KRG AD PGM PE PTVLSLLWG 


6979 


3917 


1146 


D EARVRGEAVAAAI LSR CRHWS GP P P FPPS P PDR KGLRG T E P WE 
AGPGSGATPGARAMDVRRLKVNELREELQRRGLDTRGLKTELAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSELEGT 
AQPP PPGLQPHAEPGG YSGPDGH YAMDNITRQNQFYDTQVI KQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFLPPEASQLKP 
DRQQ FQ S RKR P YE ENRGRG YFEHREDRRGRS P Q P P AEEDEDDFD 
DTLVAI DTYNCDLrHFKVARDRS S G YPL TI EG FA YL WSGARAS YG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 
GEE P F S YGYGGTG KKS TNS RFENYGDKFAEND V I G CFAD FE CGN 
DVELS FTKNGKWMGI AFR I QKEALGGQALYPHVLVKNCAVE FNF 
GQRAE PYCSVLPGFTF I QHLPLSERIRGTVGPKS KAECE I LMMV 
GLPAAGKTTWAI KHAASN P S KKYN I LGTNAIMDKMRVMGLRRQR 
NYAGR WDVL I QQATQCLURL I QIAARKKRNYI LI^TNVYGSAQR 
RKMRPFEGFQRKAIVICPTDEDLKDRTXKRTDEEGKDVPDHAVL 
EMKANFTLPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
P PEKR FDNRGGGG FRGRGGGGG FQ R YE N RG P PGG NRGG FQN RGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDKNNSNNRGSYNRA 
PQQQ P P P QQP P P PQP P PQQ P PPP PS YS PARNP PGASTYN KNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 
NQYQQYAQOWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTGNMQKLQTRS PAMSLSDPGLGYHPT 








CWTLRW P PLCSLHALHVFHCLFS SRLGTPVS PRLAMDPNCS CEA 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPHPS 
APRtiLEPQGVPSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLSIFSQEYQKHIKRTHAKHHTSEAI 
ESYYQRYLNGWKNGAAPVLLDIiANEVDYAPSLMARLILERFLQ 
EHEETPPSKSIINSMLRDPSQlPDGVLANQVYQdVNDCCYGPL 
VDC I KHAI GrtEHEVLLRDLLLEKNLSFLDEDQLRAKG YDKTPDF 
ILQVP VAVEGHI IHW I ESKAS FGDECSHHAYLHDQFWS YWNRFG 
PGL VI Y W YG F I QELDCNRERG I LLKAC F PTN I VTLCHS I A 


6982 


153 


1285 


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWLAPP 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVIKAFLCGS I SGTCSTLLFQPLDLLKTRLQTLQ 
PSDHGS RRVGMLAVLLKWRTESLLiGLWKGMS PS I VRCVPGVG I 
YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTR Y ESG KYG YES I YAALRS I YHS EGHRGLFS G LTATLLRDAP F 
SGIYLMFYNQTKNIVPHDQVDATLIPITNFSCGIFAGILASLVT 
QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGIPRA 

T.DDTT M5BMJ W r V\ T\TV t? MXA TS. VKfri T VO 


6983 


82 


773 


EMS FLQDPS F FTMGMWS I GAGALGAAALALLLANTDVFLS KPQK
AALE YLED I D LKTLEKE PRTF KAKELWEKNGAV I MAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
OfiOl* I Jj(jG VrWGSGKQGILLEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGRS AYSLPAGS LPRVPATAAAKMASG VQVADE VCRI F YDMKVR 
KCSTPEEI KKRKKAVI FCLSADKKCI I VEEGKE I LVGDVGVTIT 
D P FKH FVGML P E KDCR YAL YDAS FE TKESRKE E LM FFLWAP E LA 
PLKS KMI YAS S KDAI KKKFQG I KHECQANGPEDLNRACI AE KLG 

POT TVS j4T?rPD\? 


6985 


1887 


1324 


RRTAG I YPCFPKPGRTRHALCS WLLLLTGQLAFDD FQES CAMM 
WQKYAGSRRSMP LGAR I LFHGVFYAGGFAI VYYL I QKFHSRALY 
YKLAVEQLQSHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGQQIPVFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLYFKMGD PN S R KKQALNR LRAQ LRKKKE S LADQ FD F KM Y I AF 
VFKEKKXKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 
E LLQ KD WQLHAP R YQSMRRD VIGCTQ EMDFILW P RND I E KI VC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
v ruxr «Uo v r L>v 1 JJKUHJjy i r&NKA 1 1 r K-bCo ICLYLPQEQLTHW 
AVGTIEDHLRPYKPE 


6987 


1623 


341 


LEAAEKAS RAFKES QRQTDS KNYET ENWS PQKSQRRYDMYNTAC 
FLGEIEVGLYTIQILQLTPFFHKENELSKKHMVQFLSGKWTIPP 
DPRNECYLALSKFTSHLKNLQSDLKRCFDFFIDYMVLLKMRYTQ 
KEIAEIMLSKKVSRCFRKYTELFCHLDPCLLQSKESQLLQEENC 
RKKLEALRADRFAGLLEYLNPNYKDATTMESIVNEYAFLLQQNS 
KKPMTNE K.QUS I LANI ILSCLKPNS KL IQPLTTLKKQLREVLQF 
VGLSKQ YPGP YFLACLLFWPENQELDQDS KLI EK YVS S LNRS FR 
GQYKRMCRS KQAS TLFYLGKRKGLNS I VHKAKIEQYFDKAQNTN 
SLWHSGDWKXNEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
I S VYSGPLRSGRNI ERVSFYLGFS IEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPQPMPRADCIMRHLPYFCRGQWRG 
FGRGS KQ LG I PTANF P EQ WDNLP AD I S TG I YYGWAS VGSGDVH 
KMWSIGWNPYYluTTKKSMETHIMHTFKEDFYGEILNVAIVGYL 
RPEKNFDSLESLISAIQGDIBEAKKRLELPEHLKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LM PS DRP LS PSTHASAGS HCHAP PTTARRAFPI PFGS KSNMATL 
KDQLIYNLLKEEQTPQNKITWGVGAVGMACAISILMKDLADEL 
AL VD VI E D KLKG E MMD LQHG S L FLRTP KI VSG KD YNVTANS KLV 
I ITAGARQQEGESRLNLVQRNVNIFKFI IPNWKY3PNCKLLI V 
SNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV 
HPLSCHGWVLGEHGDSSVFVWSGMNVAGVSLKTLHPDLGTDKDK 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
RVHPVSTMIKGLYGIKDDVFLSVPCILGQNGISDLVKVTLTSEE 
EARLKKSADTLWG I QKELQF 


6990 


719 


258 


THAS GMAS WLALRTRTAVTS LLS P T P ATALAVR YAS KKS GG S S 
KNLGGKSSGRRQGI KKMEGHYVHAGNI IATQRHFRWHPGAHVGV 
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNKRRNPKKIAYLL 
SSLLNTTNLNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQ APGC S S LALRQ VRQ VY CGL VRAP QVQTR P LS S RFVE RRG AL Y 
RS PMNQENP P P YPGPGPTAPYP PYPPQPMGPGPMGGP YPPPQGY 
P YOG YPQ YG WQGG PQ E P P KTT V YWE DQRRDELG P S TC LT ACWT 
ALCC CCLWDMLT 


6993 


1 


374 


QWCVTCPQHNARQGPAVPPGIQAYGAAPFEDLQVDFTEMSKCRG 
DRVW I KNWNVASLCP LWKGPQTWLS P PTAVKVEGI PAW I HHSH 
VKPAARETWEARPSPDNPFRVTLKKTTSPAPVTPGS 


6994 


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAILMVLVALILLHSALAQSR 
RDFAPPGQOKREAPVDVLTQIGRSVRGTLDAWIGPETMHLVSES 
S S Q VLWAI S S AI S VAF FALSG I AAQ LLNALGLAG DYLAQGLKLS 
PGQVQTFLLWGAGALWYWLLSLLLGLVLALLGRILWGLKLVIF 
LAG FVALMRS V P DP S TRALLLLALL I LYALLS RLTGSRAS GAQL 
E AKVRGL ERQ VE ELR WRQR RAAKGARS VEE E 


6995 


144 


1346 


GS VAVGLSG I MAAQKDLWDAI VIGAG I QGCFTAYHLAKHRKR IL 
LLEQ FFL PHS RGSSHGQSR I IRKAYLEDF YTRMMHECYQ I WAQL 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPNI RLPRGEVGLLDNSGGVI YAYKALRALQDAI RQLG 
G I VRDGEKWE INPGLLVTVKTTSRS YQAKSLVI TAGPWTNQLL 
R PLG I EMPLQTLR INVCY WR EM VPGS YGVSQAFPCFX WLG LCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 
SSFVRDHLPDLKPEPAVIESCMYTNTPDEQF1LDRHPKYDNIVI 
GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAHL 


6996 


543 


1942 


ETANAEAAARKSAMDWKEVLRRRLATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKLR 
E E S RAVFLQRKS RELLDNE E LQNLW F L LDKHQTP PMI GEEAM IN 
YENFLKVGEKAGAKCKQFFTAKVFAiaLHTDS YGR I S IMQ F FNY 
VMRKVWLHQTRIGLSLYDVAGQGYLRESDLENYILELIPTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 
LLELRDEELSKESQETNWFSAPSALRVYGQYLNLDKDHNGMLSK 
EELSRYGTATMTNVFLDRVFQECLTYDGEMDYKTYLDFVLALEN 
RKE PAALQYI FKLLDI ENKG YLNVFS LNYFFRAI QELMKI HGQD 
PVSFQDVKDEIFDMVKPKDPLKISLQDLINSNQGDTVTTILIDL 
NGFWTYENREALVANDSENSADLDDT 


6997


370 


1104 


AMELTIFILRLAIYILTFPLYLLNFLGLWSWICKKWFPYFLVRF 
TVIYNEQMASKKRELFSNLQEFAGPSGKLSLLEVGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 
QVADGSVDWVCTLVLCSVKNQERIliREVCRVLRPGGAFYFMEH 
VAAECS TWNY FWQQVLD PAWHLLFDGCNLTRE S WKALERAS FSK 
LKLQH I QAPLS WELVRPHI YG YAVK 


6998 


2 


616 


FVSRALLRVRSRRHPAEERAAPGRfcEDAPIECPGATNCPEPLWC 
SHLPVP YAP PTMESRGKSAS S PKPDTKVPQVTTEAKVPPAADGK 
AP LTKP S KKE AP AEKQQ P PAAPTTAP AKKTS AKAD P ALLNKHSN 
LKPAPTVPS S PDATPE PKGPGDGAE EDE AAS GG PGGRGP WS CEN 
FNPLLVAGGVAVAAIAL I LGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYbKENSLHRALATLQE 
ETTVSLNTVDS I ESFVADINSGHWDTVLQAIQSLKLPDKTLIDL 
YEQVVLELIELRELGAARSLIjRQTDPMIMLKQTQPERYIHLENL 
LARS YFDPREAYPDGS S KEKRRAAI AQALAGEVS WPPS RLMAL 
LGQALKWQQHQGLLPPGMTIDLFRGKAAVKDVEEEKFPTQLSRH 
IKFGQKSHVECARFSPDGQYLVTGSVDGFIEVWNFTTGKIRKDL 
KYQAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKIKVWKIQSG 
QCLRRFERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKEFRGHSSFVNEATFTQDGHYIISASSDGTVKIWNMKTTECS 
NTFKSI^STAGTDITVNSVILLPKNPEHFWCKRSNTVVIMNMQ 
GQIVRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCFSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEED I VGRN 
LLYAACMAGQSDVIRALAKYGVNLNEKTTRGYTLLHCAAAWGRL 
ETLKALVELDVDIEALNFREERARDVAARYSQTECVEFLDWADA 
RLT LKKY I AKVS LAVTDTE KGS GKLLKEDKNT I LS ACRAKNEW L 
ETHTEASINELFEQRQQLEDIVTPIFTKMTTPCQVKSAKSVTSH 

DQKRSQDDTSN 


7001 


2056 


844 


RRCLIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRLIAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRALYESVFGSGEICGP 
TS PKRLC I RPS EPVDAWWS VKHDPLPLLPEANGHRSTNS PT I 
VS P AI VS PTQD S R PNMSRPL I TRS P AS P LNNQG I PT P AQLT K SN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
HYFIDRDGQMFRYILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
Q PMLLEMERW KQDRETGRFSRP CECLWRVAPDLGERI TLSGDK 
SLIEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFE I VGSCGGGVDS S QFS E YVLRRELRRTPRVP S VI 
RIKQEPLD 


7002 


1043 


498 


PMPSSTRWTTS*TYTDTSSAWACRPTTGTCT*TAAPGPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


" OGRFRAFCWQRDFLQPPGMRLSALIJUASKVTLPPHYRYGMSPP 
G S VADKRKNP PW I RRRP VWE P I SDEDWYLFCGDTVE I LEGKDA 

LLHRQVXLVDPMDRKPTEI EWRFTEAGERVRVSTRSGRI I PKPE 
FPRADGIVPETWIIXSPKiyrSVEDALERTYVPCLKTLQEEVMEAM 
GIKETR\NTRRSIGIEPGAEQLLPNFCPSLEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G \ P KRTLKTQLG / Y YCRVR PLGFPDQBCCIEV INNTTVQLHT P E 
GYRLNRNGDYKETQYSFKQVFGTHTTQKELFDWANPLVNDLIH 
GKNGLLFTYGVTGSGKTHTMTGSPGEGGLLPRCLDMIFNSIGSF 
QAKRYVFKSNDRNSMDIQCEVDALLERQKREAMPNPKTSSSKRQ 
VD PEFADM I TVQE FCKAEE VDEDS VYGVFVS YIE I YNNYI YDLL 
EEVPFDPINPNLHNl^CFVKIKNHNMYVAGCTEVEVKSTEEAFE 
VFWRGOKKRRIANTHLNRESSRSHSVFNIKLVQAPIiDADGDNVL 
QE KEQ I T I SQLSLVDLAGS ERTNRTRAEGNRLREAGN INQSLMT 
LRTCMDVLRENQMYGTNKMVP YRDS KLTHL F KNYFDG EGKVRM I 
VCVNP KAED YEENLQ VMR FAEVTQEVEVAR PVDKAI CGLTPGRR 
YRNQPRGP\IGNEPLVTDWLQSFPPLPSCEILDINDEQTLPRL 
I E ALEKRHNLRQMM I DE FNKQSNAFKALLQE FDNAVLS KENHMQ 
GKLNE KEKMISGQKLE I ERLEKKNKTLE YKI E ILEKTTTI YEED 
KRNLQQELE TQNQ KLQRQ FS DKRRLE ARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


PJJMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEEL 
WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 
I S S TLADTAVAAQVNGE P YDLERPLETDSDLRFLT FDS PEGKAV 
FWHSSTHVLGAAAEQFLGAVLCRGP STE YGFYHDFFLGKERTI R 
G S E LP VLE R I CQE L TAAARP FRRLEAS RDQLRQLFKDNP FKLHL 
I EEKVTGPTATVY G CX3TLVDLCQGPHLRHTGQIGGLKLIiSNS S S 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVLPVILLLLGAHPSPLSFFSAGPAT 
VAAADRSKWHIPIPSGKNYFS FGKILFRNTT I FLKFDGEPCDLS 
LNITWYLKSADCYNEIYNFKAEEVELYLBKLKEKRGLSGKYQTS 
S KLFQNCS ELFKTQTFSGDFMHRLPLLGEKQEAKENOTNLTFIG 
DKTAMHEPLQTWQDAPYrFIVHIGrSSSKESSKENSIiSNLFTMT 
VEVKGP YE YLTLEDYPLMI FFMVMC I VYVLFGVLWLAWS ACYWR 
DLLRI Q FW I GAV I F LGMLEKAVF YAGFQ 


7007 


2 


1001 


AMTVS G PG T P E PR P ATPG AS S VEQ LRKEGNE L FKCGD YGGALAA 
Y TQALG LDATPQDQAVLHRNRAACHLiKLE D YDKAE TEAS KAI E K 
DGGDVKALYRRSQALEKLGRLDQAVLDLQRCVSLEPKNKVFQEA 
LRN I GG Q I Q EKVR YM S STDAKVEQM FQ I LLD P E EKGTE KKQKAS 
QNLVVLAREDAGAEKI FRSNGVQLLQRLLDMGETDLMLAALRTL 
VG I CSEHQSRTVATLS I LGTRRWS I LGVES QAVSIAACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKQVWGLLDVTVMEGMGLSQ 
PGQFFGDQTCSCRLFGIRFGDI ILL 


7008 


70 


1478 


CRS ALGHERP PPAHLPAGGRRLQTCPRS CRWLGRP P SGLPPGPR 
S PP PLAGPGQKMVQKKPAELQGFHRS FKGQNP FELAPSLDQPDH 
GDSDFGLQCSARPDMPASQP I D I PDAKKRGKKKKRGRATDS FSG 
RFEDVYQLQEDVLGEGAHARVQTCINL ITSQE YAVKI IEKQPGH 
I RSRVFREVEMLYQCQGHRNVLELI EF FEEEDRFYLVFE KMRGG 
S I LSH IHKRRHFNELE AS VWQDVAS ALDFLHNKG I AHRDLKPE 
N I LCEHPNQVS P VK I CDFDLGSG I KLNGDCS P I STPELLTPCGS 
AEYMAPEWEAFSEEASIYDKRCDLWSLGVILYILLSGYPPFVG 
RCGSDCGWDRGEACPACQNMLFESIQEGKYEFPDKDWAHISCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARQLRNSWVDDFVAAPLIPLSQQIPTGNSLYESYYKQVDPAYTG 
R VG ASEAAL FLKKS GLS D I ILG KI WDLAD PEGKG FLDKQG F YVA 
LRLVACAQ SGHE VTLSNLNLS M P P P KFHDT SS P LMVTPPS AEAH 
WAVRVEEKAKFDG I FESLLP I NGLLSGDKVKPVLMNSKLPLDVL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79


571 


SHTRRAWPETLLSPLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GS VRAALVDQSGVLLAFADQP I KNWEPQFNHHEQSS ED I WAACC 
WTKKWQG I D LNQ I RGLGFD ATCS L WLD KQFHPL P VNQ EGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


riOtlpnqnqsqtqpllktppavlqpiapqttfgvqtqpqpqsl 

LQ AQ I S AAS I T PLLQTQ PQPLLQQ PQQ KAG LLQ P PVR I VS QPQ P 
ARRLD P P S R FS GRNDRGDQ VPNRKDDR SRERERERRRSRERS PQ 
RKRSRERS PRRERERS PRRVRRVVPRYTVQFSKFSLDCPSCDMM 
ELRRRYQNLY I PSD FFDAQFTWVDAFPLSRPFQLGNYCNFYVMH 
RE VESLEKNMAI LDP PDADHLYS AKVMLMAS PSMEDLYHKSCAL 
AEDPQ E LRDG FQHP ARL VKFL VGMKGKDEAMA I GGHWS PSLDGP 
DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPBMLSGKKAAAA 
AAAAAAAATGTEAG PG TAGGS ENG S E VAAQ PAGLSGPAE VGPGA 
VGERTPRKKE PPRAS PPGGLAEP PGS AGPQAGPTWPGS ATPME 
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER 
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATLQQLE APYNS DTVLVHRVHS YLERHGL INFGI YKRIKPL 
PTKXTGKVI I IGSGVSGLAAARQLQSFGMDVTLLEARDR VGGRV 
ATFRKGNYVADLGAMWTGLGGNPMAWS KQVNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVIiNNKP 
VSLGQALEWIQLQEKHVKDEQIEHWKKIVTCrQEELKELIiNKMV 
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLS TLS LKHWDQDDDFE FTGSHLTVRNGYSCVPVALAEG 
LDIKLNTAVRQVRYTASGCEVIAVNrRSTSQTFIYKCDAVLCTL 
PLGVLKQQ P PAVQF VP PLPEWKTSAVQRMGFGNLNKWLCFDRV 
FTOPSVNLFGHVGSTTASRGELFLFWNLYKAP I LLALVAGEAAG 
I MEN I SDD V I VG RCLA1 LKG I FGS S AVPQPKETWS R WRAD P WA 
RG S YS YVAAG S S GND YDLMAQ P I TPGPS I PG APQ PI PRL FFAG E 
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7013 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAG PGTAGG SENGS E VAAQ PAG LS GPAEVG PGA 
VGERTPRKKE PPRAS PPGGLAEP PGS AGPQAGPTWPGS ATP ME 
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER 
NAKAE KEKKLP P P PPQAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMT S QEAAC F P D 1 1 SG PQQTQKVFLF I RNRTLQLWLDNP K I QL 
TFEATLQQLEAPYNSDTVLVHRVHS YLERHGL INFGI YKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV 
AT FRKGNYVAD LG AMVVTGLGGNPMAVVS KQVNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQE FNRLLEATSYLSHQLDFNVLNNKP 
VS LG QALEWI QLQEKHVKDEQ I EHWKKI VKTQE ELKELLNKMV 
NLKE KI KELHQQ YKEAS E VKP PRD I TAE FLVKS KHRDLTAL C KE 
YDELAETQGKLEEKLQE LEANPPSDVYLSSRDRQ I LDWHFANLE 
FANATPLS TLS LKHWDQDDDFEFTGSHLTVRNG YS CVP VALAEG 
LD I KLNTAVRQVRY TASGCE V I AVNTRS TS QT F I YKCDAVLCTL 
PLG VL KQQP P AVQ F VP P L PEW KTS AVQRMGFGNLNKWL C FDR V 
FWD PS VNLFGHVGS TTASRGELFLFWNL YKAP I LLALVAGEAAG 
IMEN I S DDVI VGRCLA ILKGI FGSSAVPQPKETWSRWRADPWA 
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 

rt^yu rorl 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPD 
CI ISEAPTSPLGHLTSE YDTDRNS YQDEDTAGGPPRS PGVEWEM 
PLATDS P TSDP TEWNG I SSQPQ VP FHPNLQ KS Q YYS TVGGS HP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQL I EFEKS LAGPGTE P 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHL L VDQNLKPAP P LWR P S R P APL P PS AQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN
MELQQLRBMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERD Y I RDLEMC I ERIMVPMQQAQVPN I DFEG LFGNMQMV 
I KVS KQLIiAALE I SDAVG PVF LGHRDE LEGT YKI YCQNHDEAI A 
LLEIYEKDEKIQKHLQDS LADLKS L YNE WGCTNY INLG S FL I KP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEINVNINE 
YKRRKDLVLKYRKGDEDS LME KI S KLNIHS 1 1 KKSNRVS SHLKH 
LTGFAPQI KDE VFEETE KNFRMQERL I KS FI RDLSLYLQH I RES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLVISPLNQLLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYEALNAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQALEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLLARYPPEKLFQAERNFNAAQDLDVSLLEGDLVGVIKKK 
DPMGSQNRWL I DNGVTKG FVYSS FLKPYNPRRSHSDAS VGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
S ADVARD VKQ P TATPRS YRNFRH PE I VGYS VPGRNGQSQDLVKG 
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFKARNPNELSVSA 
NQKLKrLEFKD VTGNTE WWIiAEVNGKKG YVPSNY I RKTEYT 


7015 


1842 


513 


RQAWHE\VAAPSWRGARLVQSVLRVWQVGPHVARERVIPFSSLL 
G FQRRCVS C VAGS AFSG PRLAS AS RS NGQG S ALDH FLG FS Q PDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRWLLGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQALGVITEKETQVILLDTP 
GIISPGKQKRHHLELSLLEDPWKSMESADLVWLVDVSDKWTRN 
QLSPQLLRCLTKYSQIPSVLVMNKVDCLKQKSVLLEIjTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMLSALSQEDVKTLKQ YLLTQAQPGPWE YHS AVLTSQTPE 
E I CANI I REKLLEHLPQE VP YNVQQKTAVWEEG PGG EhVIQQ KL 

LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016 


167 


2513 


I LNAP KPPPPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKLVSQI EDAMRKAGVAHS KSS KDMESHVFLKAKTRDE YLS 
LVARLIIHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQ TQLQLQQ VAAAAAAATARS S S S S S RRR YS S S S S S SNS KQ 
FQAQQS AMQQ\QFQA \ WQQQQQL\QQQQQQQQHL I KLHHQNQQ 
QIO^X^QQLQRIAQLQLQQO^QQQQQQQQQQQALQAQPPIQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSQAQALPGQMLYTQPPLKFVRAPMWQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQS SLPMLSSPS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
XQSPVTARTPQNFSVPSPGPLNTPVNPSSVMSPAGSSQAEEQQY 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCEIALEKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDAVLAN I R SPVFNHS L YRTF VPAMTA I HGPP I TAP WCTRXR 
RLEDDERQSIPSVI^GEVARLDPKFLVNLDPSHCSNNGTVHLIC 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HR CMTSRLLQLPDKHS VTALLNTWAQS VHQACLS AA 


7017 


1 


1785 


INLGNTCYMNSVI*ALFMATDFRRQVLSLNLNGCNSLMKKLQHL 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFXLDR 
LH E E E K I LKVQASH KP S E I LE CS ETSLQEVAS KAAVLTETPRTS 
DGEKTLIEKMFGGKLRTH1RCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEYMSCPDCSQSPSIQDGGIjMQASVPGPSEEPWYNPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPG 
GETT PS VTDLLNYFLAPE ILTGDNQYYCENCAS LQNAEKTMQ I T 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 
SVWHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTSFQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDAITKDNKLYLQEQELNARARALQAASASCSFRPNGFDDNDP 
PGSCGPTGGGGGGGFNTVGRLVF 


7018 


464 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEI I FQ 
PSLIGEEQAG I AETLQ Y I LDR YPKDVQEMLVQNVFLTGGNTMYP 
GMKARME KE LLE MRP FRS S FQVQLASN P VLDAWYGARDWALNHL 
DDNEVW I TRKE YEEKGGE YLKEHCASN I YVP I RLPKQASRS S DA 
QAS S KGS AAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVFPAPS P PWMLGCCSHEVTAGPPTLCKDMSALVAA 
RMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRKNGRS 
SSGALRG VCS CVEAGKACDPAARQ FNTL I PWCLPHTGNRHNHWA 
GL YGRLEWDG F FSTTVTNPE PMGKQGR VLH P EQHR WS VRE CAR 
SQGFPDTYRLFGNILDKHRQVGNAVPPPLAXAIGLEIKLCMIjAK 
ARESASAKIKEEEAAKD 


7020 


1 


2154


FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRVR 
NGFLMRKVAVF F SNTPTRAS P QLREAVLKLSDAG I TPLFLTRQE 
DRQLINALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVCLDIC 
NI D PS CG FGS WRPS FRDRRAAG SDVD I DMAF I LDSAETTTLFQF 
NEMKKYIAYLVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 
MPPVKVEFSLTDYGSKEKLVDFLSRGMTQLQGTRALGSAIEYTI 
ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 
FFWLGIGRKVNIKEVYTFASEPNDVFFKLVDKSTELNEEPLMR 
FGRLLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNLVKFGHKQ 
VNVPNNVTS S PTSNP VTTTKP VTTTKPVTTTTKP VTTTTKP VT I 
I NQ PS VKPAAAKPAPAKP VAAKP VATKTATVR P P VAVKPATAAK 
PVAAKPAAVRP P AAAAAK P VATKP E VPR P Q AAKP AATKP ATT K P 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAVVCYLRSQVRATYHGS FS 
TKKSQPPPPQPARSASSSTINLMVSTEPIiALTETDICKLPKDEG 
TCRDF I LKWYYDPNTKS CARFW YGGCGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021 


2 


338 


VNAVS FFPNGYAFATGSDDATCRLFDLRADQELLLYSHDNI ICG 
ITS VAFS KS GRIjLIjAG YD D FNCNVWDTLKGDRAGVLAGHDNR VS 
CLGVTDDGMAVATGS WDS F LR I WN 


7022 


2 


856 


VYIGS FWSHPLLI PDNRKL FEAEEQDLFRD I QS LPRNAALRKLN 
DLI KRARLAKVHAYI I S S LKKEMPS VFGKDNKKKELVNNLAEI Y 
GRIEREHQI S PGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHD IAQLMVLVRQEESQRPIQMVKGGAFEGTLHGPFGHGYG 
EGAGEGIDDAEWVV7U?DKPNrrTJEIFYTI>SPVTXjKrrGANAKKEM 
VRSKLPNSVLGKIWKLADIDKDGMLDDDEFALANHLIKVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 


2 


748 


AMVFGG WP Y VPQ YRD I RRTQNADGFS TYVCLVL L VANI LR I LF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDEEVKVAPRRSFLDFDPHHFWQWSSFSDYVQCVLAFTG 
VAGYI TYLS I DS AL FVETLG FLA VLTEAMLG VPQL YRNHRHQS T 
EGMS I KMVLMWTSGDAFKTAYFXLKGAPI^FSVCGLI^VLVDLA 
I LGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTG VTG WAQ VWMFGGGG VLS SGEQLQMPVKPERG LGPS DG WL V 
SSRRGSPGTVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLL ITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDICPDELEKLVQV 
VRQLEAE PGLP P VQPVFI TVDPERDDVEAMARYVQDFHPRLLGL 
TGSTKQVAQASHS YRVY YNAGPKDEDQDY I VDHS I AI YLLNPDG 
LFTDYYGRSRSAEQ1SDSVRRHMAAFRSVLS 


7025 


232 


832 


ERKSP I GNNENL* K \ HSLDCLCFRGDWEGNTQFQTLQDNQEE CF 
KQ V I RTCE KR PTFNQHTVFNLHQRLNTGD KLNEFKELGKAF I S G 
SDHTQHQLIHTSEKFCGDKECGNTFLPDSEVIQYQTVHTVKKTY 
ECKECX3KSFSLRSSLTGHKRIHTGEKPFKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


328 


1146 


NPNPSIGDIKDIKKAAKSMLDPAHKSHFHPVTPSLVFLCFIFDG 
LHQALLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
S S VLRHCCDLL I GVAAGS SDKI CTS SLQVQRRFKAMMAS IGRLS 
HGESADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKS S *LPRKHR * QP INAVRMFLDQCMDGS I ALRAI VS E I PVFE 
EKKNNG* KG IGE I F * VWGCTLPPHYWGAVTTNVPKLSNSGKIjIiG 
QDEQPHIFG 


7027 


43 


954 


GRRLQQQQRPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKLFEH 
YYQELKI VPEGE WGQFMDALREPLPATLR I TG YKSHAKE I LHCL 
KNKYFKEIjEDLEMDGQKVEVPQPLSWYPEELAWHTNLSRKILRK 
S PHLE KFHQ FLVSETE SGNISRQEAVSMI P PLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLLVH 
QAKRLS S P CI MWNHDAS S I PRLQ I DVDGRKE I LFYDRI LCDVP 
CS GDGTMR KN I D VWKKWTT LNS LQ LHGLQLR I ATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSRNFIRNS 
KKMQSV^YSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
EI LLQGRL YLS ENWI CFYSN I FRWETTIS IQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VL E S NTEAKQATGTS S KLRHGTGQ E KGREG PRC P SGLAQLRLWG 
/ P CPHAGRE TG PRAS AP I PGS * GHGWHW * RKDGRGERS EG PSAL 
S PHS P S LLNMQQ APTHVG PGMGSQR P R S SWP E QVGVGS QLS RE 
RWRA* RSLPGAAASERTEMTKERSP /RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCIjLYPLQSIMPE*QLR*GAHASPPTQG 
R*GKGGPRSPLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA* QGPEPV/ WGRVTTHLQGPAG * TKPLGS \RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRS TLTF S PQLS I PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPIjPLPLPPL 
PLP L PGAGT * HS ARS GRPGQ S ETGS LCHNCHHC P PHCPKCS PGG 
T 


7030 


2 


521 


FVC FS APGSGQGG KRRVKME L S AVGE R V FAAE ALLKRR I R KGRM 
EYLVKWKGWSQKYSTW EPE EN I LDARLLAAFEEREREMELYGPK 
KRGP KP KTFLLKAQAKAKAKT YEFRSDSARGIR I P YPGRS PQDI> 
ASTSRAREGLRN\ RVCPRQRAAP APAAP\PRRGP SGPGPRPG * G 
PGLHFPGPGGPSKHGFVPASEQHQHQQHLPRRGPSGPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/CKPS/RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 
PRRLTEPPALTVS P VGRAAPSGAL* PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 
IiGAQARAAPPRLWCPRALVS G * EAS PEAVS VAAGPPVPGPTPS T 
SGSTASHSRRGC*SPR*TPAPPRRDHGRSAAFEVLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
E PWMKRQFGRLHS L F WKS WQKMNS FLLTPKLDTS LMSG WRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH*GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 
LMMPSSCPWRTGALGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WDWRRPPLQVS PAPS S S CRAS CC WCLES I T* S S S TARSRATGAS 
SS STCPTSRSDRGAAWTP \SPMGAPLLPCS VPL I SREEALQDPR 
NPS P* GVCSGSSGHAGIALGKPPVACS VP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDS YLENAPP FDELRPLTCDGHDTWGS FSLTL I DALD 
TLIATLFYFQI LGNVSE FQRWEVLQDSVDFDI DVNAS VFETNI 
RWGGLLS AHLLS K KAGVEVE AGWP CS GP LLRMAE EAARKLLP A 
FQT PTGMP YGTVNLLHGVNPGETP VTCTAG IGT FI VEFATLSSL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW 
YLWVQM YKGTVSM PVFQS LEAYW PG LQSLIGD I DNAMRTFLNYY 
TVWKQFGGLPE FYNIPQGYTVE KREGYPLRPEL I ES AMYLYRAT 
GDPTLLELGRDAVES IE KI SKVECG FATIKDLRDHKLDNRMES F 
FLAETVKYL YLL FDPTNF I HNNGS TFDAVITP YGECI LGAGGY I 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFS PENHDQARERKPAKQKVPLLS CPS 
QP FTS KLALLGQVFLDS S * PLDNFFI FIFLRLNYNKLLLAI IKK 
K 


7035 


92 


1942 


EDTS SMP FRLL I P LGLLCALLPQHHGAPGPDGS APDPAHYRERV 
KAMFYHAYDS YLENAFP FDELRPLTCDGHDTWGS FS LTLI DALD 
TLL \ TL F Y FQ I LGNVSE FQRWEVLQDS VDFD I DVNAS VFETNI 
RWGGLLS AHLL S KKAGVEVE AGWP C SGP LLRMAE EAARKL L PA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFI VEFATLSSL 
TGDPVFEDVARVALMRLWESRS D IGLVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
YLWVQM YKGTVS M P VFQ S LEAYWPG LQSL I GD I DNAMRTFLNYY 
TVWKQFGGLPE FYN I PQGYTVE KREGYPLRPEL I ES AMYLYRAT 
GD P TLLELGRDAVE S I EK I SKVE CG FAT I KDLRDHKLDNRME S F 
FLAETVK YLYLL FDP TNF I HNNGSTFDAV I TP YGEC I LGAGG Y I 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFS PENHDQARERKPAKQKVPLLS CPS 
QPFTSKLALLGQVFLDSS * PLDNFFI FIFLRLNYNKLLLAI IKK 
K 


7036 


442 


751 


CLAPLFSCFQIINLHLAFSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFS CFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGPS I PT 
RNW* ERKAGCSQPC/ P AQQHHGRP PG VS P LPRDPHPTTLRP LP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEE I IL 
Q YNKLL E KS D LHS VLAQ KLQAE KHDVPNRHE I S PGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQI TFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE * KR 
LQEAAS PAAERACRSSKGTSTSRTG 


7039 


155 


891 


G AG AAS D MS S GLRAAD FPR W KRH I S EQ LRRRDRLQRQAFEE 1 1 L 
QYNKLLEKSDLHS VLAQKLQAE KHDVPNRHE I S PGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQ ITFTALEGKLRKTTEENQELVTRWMAE KAQEANRLNARE * KR 
LQEAAS PAAERACRSS KGTSTS RTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN 
PGSQRRRLI PALSLDTSS P VRKP PNS TGVRW VDG PLR S S P RG LG 
EPFE2KVYEIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEAL E CVTEiRLES R VNFCKAHLMM I TCFD IT 


7041 


1 


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYDWGRLNLQSVTEQSSLDDFLATAELAGTEFVAEKLNIKFV 
P AEARTGL LS FEES Q R I KKLHE ENKQ F LC I P RRPNWNQNTT PEE 
LKQAEKDNFLEWRRQL\VRLEEEQKLILTPF'ERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


P I HMAAAALRAD I\ISPLFPHI QGYLLLS AS HG \ ATSLHTKGAL 
PLETVTMYTVI PKS KYVLVKPDTQYPYS ENLDEFKRLAENSASN 
DDLLMAEVAI SDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDLVSYGTGLEPLEEGERPKKPIPLQDQTVRD 
EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD 
KS VLGPED FMDEEDLSE FG I APKAI VTTDDFASKTKDR I RE KAR 
QLAAATAP I PGATLLDDLITPAKLS VGFELLRKMGWKEGQGVGP 
RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
SERAGDLGEIGLNKGRKLGISGQAFGVGALEEEDDDIYATETLS 
KYDTVLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 
SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQV 
LSESAGKATPDPGTHSKHQLNASKRAELU3ETPIQGSATSVLEF 
LSOKDKER I KEMKQATDLKAAQLKARSLAQNAQSSRAQPS PAAA 
AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAKEEDDSDQVEVPRDQENDVGDKQSAVKMKMFGKLTRDTFEW 
HPDKLLFQ / RLVGLPRVKRDKYS VFNFLTLPETASLPTTQAS S E 
KVSQHRG PDKSRKPSRWDTSKHEKKEDS I S E FLRLARS KAEPP K 
QQSSPLVNKEEEHAPELSAN 


7044 


276 


734 


EVYLTDEFAKGRKVADLYELVQYAGNIIPRLYLLI'TVGWYVKS 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELR IL VGTNLVRLSQ V 


7045 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTD I EGTLFVYRRSAS P YHGFTIVNRLNMHNLVE PVNK 
DLE FQ LHE P FLL YRNAS LS I YS I WF YD KND CHR I AKLMAD WE E 
ETRRSQQA/ RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 


103 


486 


QMKI E KCGWSEGLTS I KGNCHNFYTAI S KDVT YKELKNLLNS KN 
IMLIDVRE I WE I LEYQKI PES INVPLDEVGEALQMNPRDFKEKY 
NEVKPSKSDS / IVFSYLAGVRSKKALDTAISLGFHS YYER 


7048 


92 


627 


FFCLTLLSS WDYRHHATRRVI SSP VFTMEDSGKTFS SEEEEANY 
WKDLAMTYKQRAENTQEELREFQEGSREYEAELETQLQQIETRN 
RDLLSENNRLRMELETIKEKFEVQHSEGYRQISALEDDLAQTKA 
I KDQLQKY I RELEQANDDLERAKRATDHGLSKTFE \QRLN\QAI 
EKKW 


7049 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7050 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL 
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAXRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT 
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7051 


119 


816 


KKMNLAE I CDNAKKGREYALLGNYDSSMVYYQGVMQQIQRHCQS 
VRDPAIKOKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV 
S CQD E P FRD P AVWP P P VP AEHRAP PQ I RR /RQSRS KTSE ERNGR 
SRS PGTCR PS T\ PI S KS EKPS TS RDKD YRARGRDDKGRKNMQDG 
AS DGEMP KFDGAGYDKDLVEALERD I VSRNPS IHWDDI ADLEE A 
KKLLREAGVLPMWM 


7052 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMFAARQGPRR


7053 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKVVLDVGSGTGILSMFAARQGPRR


7054 


1 


1036 


GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
RRCRWDAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTGCYYRCHS KCLNL I S KPCVSS KVSHQAE YELNI CPETGLDSQ 
DYRCAECRAPI/CS/DGWPSEARQCDYTGQYYCSHCHWNDLAV 
I P AR WHNWD FE PRKVS RCS MR YLALMVSR P VLRLRE IN 


7055 


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 
M 


7056 


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 
SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL
M 


7057 


1368 


431 


GIYLHVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
SPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRSCGKTFYRNSQLI FHQRTHTGETYFQCTI CKKAFLR 
SSDFVKHQRTHTGE KPCKCD YCGKGFSDFSGLRHHEK IHTGE KP 
YKCPICEKSFI QRSNFNRHQRVHTGEKP YKCSHCG KS FS W S S S L 
D KHQRSHLGKKP FQ * P VTKLS FP IS ISQPSHKNTQLHQEEL CLR 
GYPC 


7058 


1 


469 


FSGFGAVPDALGCRMSDLR I TEAFLYMDYLCFRALCCKGPP PAR 
PEYDLVCIGLTGSGKTSIiLSKLCSESPDNWSTTGFSIKAVPFQ 
NAI LNVKELGGADN I RKYWSRYYQGSQGVI FVLDS ASS EDDLEA 
ARN*SCTQLLQHPQLCTLPFLILA 


7059 


1 


1178 


WPAFPRQ PAAAAMDALLGTG P RRARGCLGAAG PTS SGRAART PA 
AP WARFS AWLECVCWTFDLELGQALELVYPNDFRLTDKEKS S I 
CYLSFPDSHSGCLGDTQFSFRMRQCGGQRSPWHADDRHYNSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALLSLIAPEYFDIOJVPCLEAVCSEIDQWPAPAPGQTLNLPVM 
G WVQVR I P SRVDKSES S PPKQFDQENLLPAP WLAS VHELDLF 
RCFRP VLTHMQTLWELMLLGE PLLVLAPS PDVS SEMVLALTS CL 
QPLRFCCDFRPYFTIHDSEFKBFTTRTQAPPNWLGVTNPFFIK 
TLQHWPH I LRVGEP KMSGDLPKQVKLKKPFKV* RPWDTKP 


7060 


90 


1670 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVNPSQ 
YRFEHLVTQMKWRLQEGRGEAVYQIGVEDNGLLVGLAEEEMRAS 
LKTLHRMAE KVGAD I T VLREREVD YDSDM P R KI TE VLVRKVPDN 
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHE I QSGRTSS I S FE I LGFNSKGEVHG I NGTQWGQTLRMGW * + * 
RT* DGGRVWRLFEIV*MNALRGL*TSSAPLRKSMGNQLN* IKNG 
VKI KRQGHPGNGLG PGNSEG VGRAGRRH * G P WALGQ WNYSDS R 
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
SANTG I AGTTREHLGLAIjALKVPF F I WS KI DLCAKTTVERTVR 
QLERVLKQPGCHKVPMLVTSEDDAVTAAQQFAQSPNVTPIFTLS 
S VSG ES LDLLKVFLNI LPPLTNSKEQEE LMQQLTEFQVDE I YTV 
PEVGTWGGTLSR* IDLLATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPSPLGPPCLPVMDPETTLEEPETARLRFRGFCYQEVAGPRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFU3TLPPEIQAWV 
RGQRPGSPEEAAALVEGLQHDP*ARMPSPLGPPCLPVMDPETTL 
EEPETARLRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE 
QMLEML VLEQ FLGTL PPEI QAWVRGQRPGS PEEAAALVEGLQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWLSYFFCIPKHKLKSSQKDKVRQF^4ACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGRYKDPQDENKIGVDGIQQFCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQELKDTAKFKD 
FYQFTFTFAKNPGQKGLDL*MAGAYWKLVLSGRFKFLYLWNTFL 
MEHH 


7063 


2 


562 


LR T V PDLPGRK FRAMR TGQRR * PE LP PDMNS LEQAEDLKA FERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF 
FTSLWNHPFFTISCITLIGLFFAGIHKRWAPSIIAARCRTVLA 
EYNMSCDDTGKLILKPRPHVQ*QSSLIVMGLKIAFLRISDTAKS 
HKGFLLRLDM 


7064 


300 


884 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
S SRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTAS VPAR 
CRC P AARTGAP AAATWLRRRLSGLRAP ALGRRRS PG P S P KS AAP 
PLLTPLGAGRAGGSRANS 


7065 


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERI GKGS FGE VFKG I DNRTQQ WAI KI I DLEEAE DE I E D I 
QQE ITVLSQCDS S YVTKYYGS YLKGS KLWI I ME YLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQICDT 
YNQKHSLFNAMNRFIGAVNNMIX3TVMVPSLIiRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KEN I TMATE IGS P PRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAG Y YNDLVPP IGMLNNPMNAVTTKFWTS TNKVKCP VFWRW 
T PEGRRLVTGAS SGEFTLWNGLTFNFETILQAHDS P VRAMTWSH 
NDMWMLTADHGG YVKYTtf QSNMNNVKMFQAHKEAI REARF I HN I P 
FS WP I VMVKLFSXCILGAEMHGLCQFLGNFLHP INTI FFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVLLLFLALCSAKPFFSPSHIALKNMMLKDMEDTDDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTS VPTN I PFDTRMLDLQNNKI KEI KENDFKGLTSLY 
GLILNNNKLTKIHP KAFLTTKKLRRL YLSHNQLS E I PLNLP KS L 
AELR I HENKVKKI QKDTFKKK 


7069


1147 


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAKQ 
TLKD KTG TDSNS TE S S ETS TGS LCKE S FS GQVS S SS LMPLT P FW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 
E KTKXGRKDKAKKS KTKMPS LVKKWQS IQRELDEEDNS S S S EED 
RVSTAQKRIEEWKQQQLVSGMAERNANFEA 


7070 


1 


547 


DGTMEDSEAVQRATALIEQRLAQEEENEKLRGDARQKLPMDLLV 
LEDEKHHGAQS AAIjQK VKGQER VRKTS LDLRREI IDVGG IQNL I 
ELRKKRKQKIOlDAIiAASHEPPPEPEEITGPVDEETFLKAAVEGK 
MKV I E KFLADGGS ADT C DQ FRRTALHRAS L EGHMEI LE KLLDNG 
ATVDPQ 


7071 


2 


921 


ARGTLRALETAXKVGKVGANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYKSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7072 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS 


7073 


50 


504 


LAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVR YRKWI LG YRCVGKTS LAHQFVEGEFSEG YDPT VENTY SKI 
VTLGKDEFHLHIiVDTAGQDE YS ILP YS F I IGVHGYVLVYSVTSL 
HSFQVIESLYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGHSSLVTYFGKPTRRKEFLLGHCIAAGKMNIS 
VDLETNYAELVLDVGRVTLGENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGGVI KAE IENEDYS YTKDG IGLDLENS FSNI LLFVPE 
YLDFMQNGNYFLIFVKSWSLNTSGLRITTLSSNLYKRDITSAKV 
MNATAALEFLKDMKKTRGRLYLRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKLTFTESTHVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVERIMKKTEESESQ 
VEPEI KRKVQQKRHCSTYQPTPPLSPAS KKCLTHLEDLQRNCRQ 
AI TLNES TG PLLRTS I HQNS GG Q KS QNTGLTTKKF YGNNVEKVP 
IDII 


7076 


279 


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWSKGRKRKKPLRDSNAPK 
S PLTGYVRFMNERREQLRAKRPEVPFPE ITRMLGNEWSKLPPEE 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHE KETE VKERS VFD I P I FTE E FLNHS KARE AEL 
RQLRKSNME FEERNAALQKHVE S MR TAVE KLE VD V IQERSRNTV 
LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSEINGLALRKTDKYGFLGGSQYSGSliKSSIPVDVARQR 
ELKWLDMFSNWDKWLS RRFQ KVKLRCRKG IPS SLRAKAWQYLSN 
S KELLEQNPRKFEELERAPGDPKWLDV I EKD LHRQ FP FHEMFAA 
RGGHGQQDLYRI LKAYT I YRPDEGYCQAQAP VAAVLLMHMPAEQ 
AFWCLVQICDKYLPGYYSAGLRAIQLDGKIFFALLRRASPLAHR 
HLRRQR I DP VL YMTE W FM C I FARTL P WAS VLRVWDMFFCEGVK I 
I FRVALVLLRHTLGS VE KLRS CQGM YETMEQLRNLPQQCMQEDF 
LVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFILLAKGTSGSALTALISQVLEAPG 
VYWGELLELANVQELAEGANAAYLQLLNLFAYGTYPDYIANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQAR KKRRG 1 1 EKRRRDR I NSS LS ELRRL VP TAFE KQGSS KLEK 
AE VLQMTVDHLKMLHATGGTGTHALLFQASFIQQ I F 


7080 


200 


595 


VQLPLEAPCLS LLS CRDHSGGNRDLSRRHRDCRVYGS PQDG I PY 
LTHPLCHQDWSVGRLQIRALATPGHTQGHLVYLLDGEPYXGPS 
CLFSGDLLFLSGCGEFPRKREELGEEGETEVRAATVPWRALKP ' 


7081 


213 


506 


AVTEEEM I LNS LSLCYHNKLI LAPMVRVGTLPMRIiLALDYGADI 
VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


' APSRNTNLMAWCRGPVLLCLRQGLGTNSFLHGLGQEPFEGARSL 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATE VEERHVSPS CS TSRERPFQAGEL I IiAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGKIVGKFPGQILRSSFGKQYMLRRP 
ALEDYWLMKRGTAITFPKDINMIIiSMMDINPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDLAKKNYKHWRDSWKLSH 
VEEW PDNVD F I HKD I SGATED I KS LT FDAVALDMLN PHVTLP VF 
YPHLKHGGVCP VYWNI TQVI ELLD 


7083 


115 


541 


'RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP 
SPKAPRARPCRVSTADRSVRKGIMAYSLEDLLLKVRDTLMIADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 


3 


522 


NSVSVSSQSRFLASVPGTGVQRSAAADMAASTAAGKQRIPKVAK 
VKNKAPAE VQ ITAEQLIiREAKERELELLP PPPQQKITDEEELND 
YKLRKRKTFEDNIRKNRTVISNWIKYAQWBESLKEIQRARSIYE 
RALDVDYRNITLWLKYAEMEMKNRQVNHARNIWDRAITTL 


7085 


243 


1499 


RQLARLRRRGWRSPFGGAPMAHITINQYLQQVYEAIDSRDGASC 
AELVSFKHPHVANPRLQMASPEEKCQQVLEPPYDEMFAAHLRCT 
YAVGNHDF I EAY KCQTV I VQS F LRAFQAHKE ENWAL P VMYAVAL 
DLRV FANNADQQLVKKG KS KVGDMLE KAAE LLMS CFRVCAS DTR 
AGIEDSKKWGMLFLVNQLFKIYFKINKLHLCKPLIRAIDSSNLK 
DDYSTAQRVTYKYYVGRKAMFDSDFKQAEEYLSFAFEHCHRSSQ 
KNKRM I L I YLLP VKMLLGHM PTVE LLKKYHLMQFAEVTRAVS EG 
NLLLLHEALAKHEAFF I RCG I FL I LEKLKI ITYRNLFKKVYLLL 
KTHQLS LDAFLVALKFMQVEDVD IDEVQCILANL I YMGHVKGYI 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNS KLR P E VMQD LLE S TD FTEHE IQE WYKGFLRDCP 
SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 
EF 


7087 


166 


723 


LSGS SAGKVAAPCVPPSNHELVP I TTENAPKNWDKGEGASRGG 
NTR KS LEDNG S TR VTP S VQPHLQP I RNMS VS RTMEDS CELDLVY 
VTER 1 1 AVS FPS TANEENFRSNLRE VAQMI*KSKHGGNYLLFNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKICSICKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAAS PSS LLEMAGE ITETGELYS S YVGLVYMFNLI VGTGALT 
MPKAFATAGWLVSLVJ^V t jjtir MirM ill r v JLxiAfaAAAiN/iyijnn 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRP IliSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDLAIYA 
AAVP FSIjMQVTCS ATGNDS CG VEADT KYNDTDRCWG PLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD 
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKSEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL 
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ
ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRSEEQELEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRIVKP KVASME EMATFHTDAYLQHLQKVSQEGDDDHPD 
S I E YG LG YD C PATEG I FD YAAAI GGAT I TAAQ C L I DGM CK VAIN 
WSGG WHHAKKDEASGFCYLNDAVLG I LRLRRKFERILYVDLDLH 
HGDGVE DAFS FTS KVMTVS LHXFS PGFFPGTGDVSDVGLGKGRY 
YS VNVP 1 QDG I QDEKY YQ I CER YE PPAPNPG L 


7092 


522 


809 


KQGINEDQEESQKPRLGEGCEPISKRQMKKLIKQKQWEEQRELR 
KQKRKE KRKR KKLERQCQME PNS DGHD R KR VRRD WHS TL RL 1 1 
DCSFDXLM 


7093 


454 


655 


NFG VS G V E LAQQ ASMVRM S F V IAACQL VLG LLMTS LTE SSI QNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQS ISFFGWWAGSEEPFS FYG 
DI IAFPLQDYGGIMAGLGSDPWWKKTLYLTGGALLAAAAYLLHE 
LLVI RKQQE I DS KDAI I LHQFARPNNG VPSLS P FCLKNET YLRM 
ADLPYQNYFGGKLSAQGKMPWIEYNHEKVSGTEFI I 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR 
SLECVSHEVDSHYCPSCLENMPSAEAKLKKNRCANCFDCPGCMH 
TLSTRATSISTQLPDDPAKTTMKKAYYLACGFCRWTSRDVGMAD 
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
SAMSPAPDAAPAPAS ISLFDLSADAPVFQGLSLVSHAPGEALAR 
APRTSCSGSGERESPERKLLQGPMDISEKLFCSTCDQTFQNHQE 
QREH Y KLDWHRFNL KQRL KD KPLLS ALD FEKQS STGDLS S I S G S 
EDSDS AS E EDLQTLDRE RATFEKL SRP PG FYPHR VLFQNAQGQ F 
LYAYRCVLGPHQDPPEEAELLLQNLQSKGPRDCVVLMAAAGHFA 
GAI FQG R E WTHKT FHR YTVRAKRGTAQGLRDARGG PSHS AG AN 
LRR YNE ATLYKD VRDLLAG P S WAKALE EAGT I LLRAPRS GR S L F 
FGGKGAPLQRGDPRLWDI PLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHSPQTHl»mTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 
LDEAKAPGQ P E LWNALLAACRAGDVG VLKLQ LAP S P ADP R VLS L 
LSAPLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGLETNNSNSELPLRVGLKVAQGSPLMGGQVSA 
SNS FSRLHCRNANEDWMS ALCPRLWDVPLHHLS I PGSHDTMTYC 
LNKKS P I SHEE S RLLQLLNKALPC I TRPWLKWSVTQALDVTEQ 
LDAG VRYLDLR I AHMLEGSEKNLHFVHMVYTTALVEDTLTE ISE 
WLERH PRE W I LACRNFEGLSEDLHEYLVACI KNI FGDMLCPRG 
EVPTLRQLWSRGQQVIVSYEDESSLRRHHELWPGVPYWWGNRVK 
TEAL I R YLETMKS CGR 


7098 


82 


956 


SSFLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVMRLQT 
EARSGFWAPNRFPVN I CRMTAVDGDRGGSSRETCRCHFHPSLEA 
LVLLLQDWQPGGVGI CTS FLG I S WALLDYHRALRTCLPS KPLLG 
LGSSVI YFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLWVWLQGTDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AI IHFAFLLSDS I LLVATWVTHSSWLPSGIPLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLEVHVPQIGAGVSLPGILAAKCGAEVILSDSSELPHCLEVC 
RQS CQMNNLPHLQ WGLTWGHIS WDLLALP PQD I ILASDVFFEP 
E D FED I LAT I YF LMHKN P KVQLWSTYQ VRSADWSLEALL Y KWDM 
KCVHIPLESFDADKEDIAESTLPGRHTVEKLVISFAKDSL 


7100 


205 


671 


ANGG FWEAAPGS EVSLPLWVPTASHS KTTALQ IGSAPPPHLS VL 
FLFSFPPQLGDPLEAFPVFKKYDRNGLNVSIECKRVSGLEPATV 
DWAFDLTKTNMQTMYEQSEWGWKDREKREEMTDDRAWYLIAWEN 
SSVPVAFSHFRFDVERGDEVLYW 


7101 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITTPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7102 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK
VTSRWIPLINERTDKDSRLPLILGGNKSDLVRYSR


7103 


119 


438 


GSQSS VAVN I RSGTDEE SMDLMNGQT^SS VNIAATAS EKS SS SES 
LSDKGSELKKSFDAWFDVLKVTPEEYAGQITLMDVPVFKAIQP 
DELS SCGWNKKEKYS SAP 


7104 


1670 


795 


RLWEHRS VSAGASGWGLSS PGCLLLHPSLPEEERVD I LINKAGV 
MRC PH WTTEDG FEMQFG VNHLG EAW AG AAPWVQAI LP RR P P KVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGHIDFDDLNWQTRKYNTKAAYCQS\KLAIVLFTKELSRRLQ 
GSGVTVNALHPGVARTELGRHTG IHGS TFLQHHN\ WAHLLAAWS 
KSPRSWPAPAQHNTLAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVRBQPLPR 


7105 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGLQDPQCLALFRVAVDKHQA 
LLKAAMS GQG VDRHL FAL Y I VS RFLHLQS P FLTQVHS EQWQL S T 
bQ 1 t? VOQMH L e u VHN YPD YVS SGGG FG P ADDHGYGVSYIFMGDG 
MITFHISSKKSSTKT DSHRLGQH I EDALLDVAS L FQAGQHFKRR 
FRGSG KEN S RHRCG FLSRQ TG AS KASMTSTDF 


7106


14 


1064 


GLQAGHPH PRSAS R I PE ADTH \ YS KLQRAFDS I VNKDHKRMFGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
GAE FVEV I KDSTPVDKTKLDPNKAYIQ I TFVEP YFDE YEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
P Y I KTRIS VI QKEEFVLTP I EVAIEDMKKKTLQLAVAINQEPPD 
AKMLQMVLQGS VGATVNQG PLEVAQVFLAE I PADPKLYRHHNKL 
RLCFKE FI MR CGEAVEKNKRL I TADQRE YQQELKKNYNKLKBNL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 


7107 


1145 


591 


*I*WLQTGKKK 


7108 


1 


942 


VKVALLLTNLE Q PRTES E WENS FTLKM FLFQFVNLNSST F Y I AF 
FLGRFTGHPGAYLRL INRWRLEEC34PSGCLIDLCMQMGI I MVLK 
QTWNNFMELGYPLIQNWWTRRKVRQEHGPERKISFPQWEKDYNL 
QPMNAYGL FDE YLEM ILQFG FTTI FVAAFPIAPLLALLNNI IE I 
RLDAYKFVTQWRRPLAS RAKDIG I WYGILEGIGI LS VITNAFVI 
AITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRIS 
DFENRS EPESDGSEFSGTPLKYCR YRD YRDPPHS IiVPYG YTIiQF 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE 
TQ P F P KS EQ VYLHFLS WTE DG PE PKDKGSLPQPPI TE VE S QVF 
SEKLATDTSTF EATS EGTLE LQQRN P KAERLRW S P AQEE S FRQM 
WIHKE I PTGKKDHECSECG KTFI YNSHLWHQRVHSGE KP YKC 
SDCGKTFKQSSNLC3QHQRIHTGEKPFECNECGKAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 
KAYGWCSELIRHRRVHARKEPSH 


7110 


96 


697 


RLDNFSGFLVEVTKEERHI VKPIiYDRYRLVKQMLTRAS I TPVLG 
S PSTKRRGQMLQP 1 1 EGETAHFFEE I KEEEEDGVNLSSELGDML 
KTAVQVQSSLKNSESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTIiREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
Y KK I KAKLRLLE VL»I S KQD S S KS I 


7111 


2 


414 


GSG LYRGPT PGGQC I WKPNS MPPDHE RNFGFTQ FALELNE LTAE 
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEQLQRDRRKV 
MEENN I VHQARFFRRQTDS SGKEWWVTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


PRC F P VADRGRL I GGL P DWT I MEGKTLNLTCTVFGNPD P EV I W 
FKNDQDIQLSEHFSVKVEQAKYVSMTIKGVTSEDSGKYS INIKN 
KYGGEKI DVTVS VYKHGE KI PDMAPPQQAKPKL I PAS AS AAGQ 


7113 


1 


824 


KCLRQAWHEAPSSLiAFTRWCSREERAEGGGNIiHRSITRDPKPPG 
LRPSQRPMDDKKKKRS P KP CLiAQPAQAPGTLRRVPVPTSHSGS L 
ALGLPHLPSP KQRAKFKRVGKEKGRP VLAGGGSGS AGTPLQHS F 
LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKIJU^LHFSLDVCGEEEDDEEEEDGVrEGLPEEQKKTMADRNli 
DQLL S NLG S CLGAL VPGGMRGGEGT YS QSH S WALGEKVG VHG S K 
SSGPLNLPRR 


7114 


3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQECKICRKI 
IYLNTDFVSVKQRLPKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNLHKAQPAERFFDPNQRGKALHQKQALRKSQRS 
QTGEKLYKCTECGKVFIQKANLWHQRTHTGEKPYECCECAKAF 
SQKSTLIAHQRTHTGEKPYECSECGKTFIQKSTLIKHQRTHTGE 
KP FVCDKCPKAFKSS YHL IRHEKTHIRQAFYKG I KCTTS S LI YQ 
RIHTS E KPQCSEHGKASDEKPS PTKHWRTHTKENI YECS KCGKS 
FRGKSHLSVHQRIHTGEKPYECSICX3KTFSGKSHLSVHHRTHTG 
EKPYECRRCGKAFGEKSTLIVHQRMHTGEKPYKCNECGKAFSEK 
SPLIKHQRIHTGERPYECTDCKKAFSRKSTLIKHQRIHTGEKPY 
KCS E CG KAFS VKS TL I VHHRTHTGEKP YE CRDCG KAFSGKST1»I 
KHQRSHTGDKNL 


7115 


1 


947 


NAAHGYNWGLWCMYI I P PQD WLDRGDE SAP I RT P AMIGCS FWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVW^DFICSHVYMAWNIPM 
SN PGVD FGD VS E RLALRQRLKCRS FKWYLENVYPEMRVYNNTLT 
YGE VRJNS KAS AYCLDQGAEDGDRAILY PCHGMS SQLVR YS ADGL 
LQLGPLGS TAFLPDS KCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSG P I VSRATGRCLE VEMSKDANFGLRLWQRCS GQKWM I RN 
WIKHARH 


7116 


866 


95 


RVRMRRNAE VI EEKLSMKS WAKFRPGEPWKG YPN I DPETDP YVT 
PG S V I NNLS I NTVRE VDHLRDRNSGS S S S LNTTLP S TS AWS SIR 
ASNYNVPLS STAQSTS ARNSDS KLTWSPGS VTNTSIiAHELWKVP 
LPPKNITAPSRPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGS SWGBSSSGRI TNW L VLKNLT P Q IDGSTLRTLCMQHG PL I T 
PHLNLPHGNALVRYSSKEE WKAQKSLHI SDLFLLTL 


7117 


695 


1261 


LLI STPGGCHPPPSS I E FTYTGAWGKALPAPHMPCAPGALPQGA 
FVS QAARAI P LLQP S QAAQ AEG LS Q P ARACG ALCSLP W PLRNWG 
S PI LRLPGGLRTPTNDRKTRTRS AMACWARAQWDTLGPLKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQTANLSVVFKDS 
NSTTPL I FVLS PGTDPAADL YKFAEEMKFSKKLS AI S LGQGQGP 
RAEAMMRSSIERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 
RDFRLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRANLLKSYSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTIX5DLRICISQLKMFLDEYDDIPYKVLKYTAGEINYGGRVTD 
DWDRRCIMNILEDFYNPDNOjSPEHSYSASGIYHQIPPTYDLHGY 
LSY IKSLPLNDM P E I FGLHDNAN IT FAQNETFALLGTI IQLQPK 
S SS AGSOGREE I VED VTQNI LLKVP E P I NLQWVMAKYP VLYEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLVVMSSQLELMA 
ASLYNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
I PAVF W I SGFFF PQAFLTGTLQNFARKF VI S IDT I S FDFKVM FE 
APS ELTQRPQVG CY I HGL FLEG ARWD P EAFQLAES QP KEL YTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRHW I KRGVALI CALDY 


7121 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE 


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV 
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF 
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE 


7123 


1 


1092 


KPAVPEARS AGTS EAGRSGAEE VS CGS VSGDGAAMRLTPRALCS 
AAQAAW REN FPLCGRD VAR WFPGHMAKGL KKMQS S L KLVD CUE 
VHD AR I PL SGRN P LFQE TLGLKPHLLVLNKMDLADLTEQQ K I MQ 
HLEGEGLKNVI FTNCVKDENVKQ I I PMVTELIGRSHRYHRKENL 
EYCIMVIGVPNVGKSSLINSLRRQHLRKGKATRVGGEPGITRAV 
M S KI QVSE RPLM FLLDT PG VLAPR I ES VETGLKLALCGTVLDHL 
VGEETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM 
LDLD VLRGH PR V 


7124 


2 


382 


LPLTLLLAAPFAHLLLPPGHDQSPCWHPGPALSPGTLGPLSWAM 
ANSGLQ LLG YFLALGG WVG 1 1 ASTALP QWKQ S S YAGDAS I QLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE 
F I ELRKWLKAR KFQDSNLAPACFPGTGRGLMSQTS LQEGQMI I S 
LPESCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEKH 
AGHRSLLEA\YLEILPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 
RAHVQEFFASSRDFFSSLQPLFAEAVDS I FS YSALLWAWCTVNT 
RAVYL \ S PGSGNAF IQS RTP VQLAP YLD LLNH S PHVQ VKAAFNE 
ETHS YE IRTTSRWRKHEEVFI CYGPHDNQRLFLE YGFVS VHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSLGHLPTQPWLWAAMSPRGQERGT 
S HSQARE PQR PGRWLLGSLQS SPGTLGQAGTASRRRGCMVQRWV 
QVATGRRAVQVP KGALGLALGETS PGASRGMSGGAGGCWALGWA 
PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEAS ST* TPGSGSRARSGRRS PGSRRRSASAPS PTP 
PTDACA*SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST* KAGYYE ETEGDC I PKDR* I E KRP FKE I * RR I PRI F 
AKQKQ I*S*NSQKI GAS E I DRGRKEAD CS DAP AAAR I GAVS VFR 
RSTQEARVSPRSNAKSANLRAVRAD * WEHFVLLFHTPEQFLAEC 
I CRST* * K*WHQLC* PLSSL*TGLKRKLLL* VLFRI * WLKDCDV 
* FCQ K I FATNFCNWQNLI Q * EE * KPVE YS VEN+ H I MNLLLPM+ L 
CQSSLRDQTIVTWRM*RNYSMFRINMISSL*DGSIHIPLKLHFY 
PALI FTLTVPINSCCQRPLPLFAHQS IKTLASSGS PMLACLRFL 
LVKKRAFIHTPRSPGCSV* CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWT I LLGRS ALRELSQ I EAELNKHWRRLLEGLS YYKPPS P 
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDERHPYRVEYADCVDKLEKELVSKYRQQFEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEI I FLYYAYFEMAPSD 
LL VLT KM F KEQG FG S RQ TNRHL VDETMD PFVDR I G Y FS AL I L VE 
GMDIESLHKCALDDRRELHQFAQDGLICQDMDCLMLTFGDIPHH 
AP VLLAWALLRHTLN P E ETS SWR KI GGTAI QLNVFQYLTRLLQ 
SLASGGNDCTTSTACMCVYGLLS FVLTSLELHTLGNQQDI IDTA 
CEVLADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RALVSGKSTAKKVYSFLDKMSFYNELYKHKPHDVISHEDGTLWR 
RQTPKLLY PLGGQTNLRI PQGTVGQVMLDDRAYLVRWE YS YS SW 
TLFTCE I EMLLHWSTADVIQHCQRVKP I IDLVHKVI STDLS I A 
DCLLPITSRIYMLLQRLTTVISPPVDVIASCVNCLTVLAARNPA 
KVWTDLRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE 
Q PQGE YG VT I AFLRL I TTLVKGQ LG S TQSQG L VPC VMFVLiKEML 
PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS 
LQFLC I CSLAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 
QGO^QLLIKTVTOAFSVTNNVIRLKPPSNVVSPLEQALSQHGAH 
GNNL I AVLAK YI YHKHDPAZjPRLAI QLLKRLATVAPMS VYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VMIL\EFLTVA\VETQP
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWELIDSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSILETCALIMKIICLEIYYVVKGSLDQP 
LKDTLKKFSIEKRFAYWSGYVKSLAVHVAETEGSSCTSLLEYQM
LVSAWRMLLIIATTHADIMHLTDSWRRQLFLDVLDGTKALLLV 
PASVNCLRLGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDSWLQVTRRLPILPTLLTTLEV 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQSXCLPL
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSIiMEQLLKT 
LRYNFLPEALDFVGVHQERTLQCLNAVRTVQSLACLEEADHTVG 
FILQLSNFMKEWHFHLPQLMRDIQVNLGYLCQACTSFLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLLKILSKTLAALRHFTPDVCQILLDQSLDLA
EYNFLFALSFTTPTFDSEVAPSFGTLLATVNVALNMLGELDKKK
EPLTQAVGLSTQAEGTRTLKSLLMFTMENCFYLLISQAMRYLRD
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR


7129 


1 


1054 


FRRFRWRRRLH+AGPASSAGGSPGEASGTMSGELPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQSFNAWNYTNRSGDAPLTVNEL 
GTAYVSATTGAVATALGLNALTKHVSPLIGRFVPFAAVAAANCI
NIPLMRQRELKVGIPVTDENGNRLGESANAAKQAITQVWSRIL
MAAPGMAIPPFIMNTLEKKAFLKRFPWMSAPIQVGLVGFCLVFA
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGL 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSWVSQPNKENWCQDHLYNSLGRKG 
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLLSLHRSS 
RCESHQDLLPDIADSHQOGTEKLSDLTLQDSQKWWNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP
YNDSDKLNDYLWRGPSPNQQNIVQSLREKFQCLSSSSFA 


7131 


805 


573 


AAAEGHIEWKFLIEACKVNPFAKDRWGNTPLDDAVQFNHLEW
KLLQDYQDSYTLSETQAEAAAEALSKENLESMV 


7132 


1420 


1087 


IDMLLLSGALVSGPYTLITTAVSADLGTHKSLKGNAHALSTVTA 
IIDGTGSVGAALGPLLAGLLSPSGWSNVFYMLMFADACALLFLI
RLIHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


QQIPGLLPAHGESGDALRKPRLQKPITGHLDDLFFTLYPSLEKF
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ
RPQVWLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL 
KJjir Ctrl w\jtixrJ\C /\v x r \j,LiCi x v r o trf\\j v iajim/vvo v x ouuiHiiftv-.i ui 
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL 
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFLAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA
RYLAVQTLQIDVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE 
LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 
VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR
KHVVG^QKLADVDSE LAAMLLTHARQGKGPQDVS RESDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR 
ERTKAESIASLLSLAITTEHTLHATLGVAEFFEFVLKNPHNTQH
TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGLSNEKGMDAV
SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHWDQVFR
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFLKVASGPSPEIKDFFVUYSDRWLATPT 
QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH
PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD 
CHQLVASWLVCLCCRQPLISKAFEIMLAAGEGKGVNKRITYTNP
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV
GEEEILIYINDHEDKNBEAFCVKVIYQ


7134 


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYEEGLIDNSG 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDFNFQEFQYLKEEQTILPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI
YRPVTTWPFIIKSPKQYKNLSFMDAMNKFKWTKKEGLSFNKLVL
SLPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPLVCGTSSS
SSLHRDFSINLLVCLLLLSCTLSTKSL


7135 


2 


2072 


FVPRVTPRSLSLQGPKGESVGSITQPLPSSYLIFRAASESDGRC 
WLDALELALRCS S LLRLGTCKP5RDGE PGTS PDAS PSS LCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 
ENKSLMWTLLKQLRPGMDLSRWLPTFVLEPRSFLNKLSDYYYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS
GSITAKSRFYGNSLSALLDGKATLTFLNRAEDYTLTMPYAHCKG
ILYGTMTLELGGKVTIECAKNNFQAQLEFKLKPFFGGSTSINQI
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR
QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAL
EEAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPWDP
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILSIREAQQELHRHLSAMLSSTARAAQA 
PTPGLLQSPRSWFLLCVFLACQLFINHILK


7136


2 


418 


DFVPSFRRPSGNTSQTVWLLRAATLEKEVAGLREKIHHLDDMLK
SQQRKVRQMIEQLQNSKAVIQSKDATIQELKEKIAYLEAENLEM 
HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 

IKVVDl 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP 


7138 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQKRLRRQKKGVVPFLGDFLTELQRLDSAI 
PDDLDGimJKRSKEVRVLQEMQLljQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP 


7139 


1 


357 


SLRNSARGLKMAASAARGAAALRRSINQPVAFVRRIPWTAASSQ
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALQ
QENHIIDGVKVQVHTRRPKLPQTSDDEKKDF


7140 


1401 


1357 


RASSLQVLKAWGGLIPSSFQQQHTGQYALEELFDLKVYDCFCSF
NMNVSLEKQLRPSQPWPRGKCRKTPGWBEARPKAQDLRGDLGKT 



WO 01/53312 



PCT/US00/34263 



QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 

n x c *\oyL/r zr uric o du i ^ rvo uucy x ni*nii ryj n 1 1 rt rv, j y vltt, 1 1 ] yyjfv r 

ADFMTNQCG 


7141 


124 


1073 


LDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD 
EANRLiAAOLEOCAT/'jnRF^ AGFfiT.flPRPWPQPRPPTTMrr vnon 

VRDLL.PTVNSLTRS TPS / LKQPDAS TPE * * * EGVSQGS PG YI WK 
EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP 
GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142 


658 


839 


i-t x r juriiJiTriEiijfvJriijoo v i uniKHr biwi uijivr 1 oi_Ju JL r y In V jLuN.Li.Lj 
KK*SRAVGWWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GSTTSSSKNIAYNCCWDQCQACFNSSPDLADHIRSIHVDGQRGG
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA
SFASQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR
KLKNKRRRSLARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
HSWFHSTVSILLFFQIKYKTLQKNISTIISKSLKI 


7144 


1 


988 


FRVi^QDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
RCPAPRPAGVSYVIRDEVEKYNRNGVNALQLDPALNRLFTAGRD 
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGTII
VSGSTEKVLRVWDPRTCAKiMKLKGOT^ 

GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG
RDRKIYCTDLRNPDIRVLICE 



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 



20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.